
parallel programming (Histogram)

$30-250 USD

Closed
Posted more than 4 years ago

Paid on delivery
In this project, you will develop a complete CUDA program that computes the histogram of an input array. The histogram is computed on the device (GPU). After the device histogram has run, your program will also compute the histogram sequentially on the CPU and compare that result with the device-computed result. If they match, the program prints "Test PASSED" to the screen before exiting.

Assume the histogram has 256 bins, i.e., bin 0, bin 1, ..., bin 255. An input value i is mapped to bin i. Use the following pseudocode for array initialization:

    int *A;
    A = malloc(sizeof(int) * N); // N is the size of the input array
    int init = 1325;
    for (i = 0; i < N; i++) {
        init = 3125 * init % 65537;
        A[i] = init % 256;
    }

Task 1 - Basic CUDA program using global memory

Develop a CUDA program in which the GPU threads collectively perform the histogram calculation. Use an atomic instruction to ensure that only one thread at a time updates any given location in the global histogram array.

Task 2 - CUDA program that takes advantage of shared memory

In Task 1, you will find that the speedup of your GPU program over the CPU version is very limited because of the atomic accesses to the global histogram array. Modify the code from Task 1 to try to improve the speedup by using GPU shared memory and registers.

Record your runtime for different input array sizes, as shown in the following table, for Task 1 and Task 2, and compute the speedup as the CPU computation time divided by the GPU computation time. The thread block size is not specified; you may explore different thread block sizes to find the best one for each input array size. A thread block size of 256 is the most obvious choice.

Optional: You can also include the memory transfer time between CPU and GPU in the GPU computation time (in that case, it may be fair to also include the array initialization time in the CPU computation time) and recompute the speedup.
Time                     | 131072 (128*1024) | 1048576 (1024*1024)
CPU computation time     |                   |
GPU computation time     |                   |
GPU memory transfer time |                   |

Note that the compile command for a CUDA program that uses atomic instructions should include the -arch compiler option. The following command can be used to compile the CUDA source file histogram.cu:

    nvcc [login to view URL] -o histogram -arch=sm_30
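The host-side parts of the brief (array initialization, the sequential CPU histogram, the comparison, and the speedup figure) can be sketched in plain C as follows. This is a minimal sketch, not the expected solution: the names init_array, histogram_cpu, histograms_match, speedup, and NUM_BINS are hypothetical and chosen only for illustration, and the GPU kernels for Task 1 and Task 2 are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_BINS 256  /* bin 0 .. bin 255, as stated in the brief */

/* Fill a[0..n-1] using the recurrence from the brief's pseudocode. */
void init_array(int *a, size_t n) {
    assert(a != NULL || n == 0);
    int init = 1325;
    for (size_t i = 0; i < n; i++) {
        /* init stays below 65537, so 3125 * init fits in a 32-bit int */
        init = 3125 * init % 65537;
        a[i] = init % 256;
    }
}

/* Sequential CPU reference: input value v is counted in bin v. */
void histogram_cpu(const int *a, size_t n, unsigned int hist[NUM_BINS]) {
    memset(hist, 0, NUM_BINS * sizeof hist[0]);
    for (size_t i = 0; i < n; i++) {
        hist[a[i]]++;
    }
}

/* Returns 1 if both histograms agree in every bin, 0 otherwise. */
int histograms_match(const unsigned int *x, const unsigned int *y) {
    return memcmp(x, y, NUM_BINS * sizeof x[0]) == 0;
}

/* Speedup as CPU time divided by GPU time (both in the same unit). */
double speedup(double cpu_time, double gpu_time) {
    return cpu_time / gpu_time;
}
```

In a full program, the second argument to histograms_match would presumably be the histogram copied back from the device, and "Test PASSED" would be printed when it returns 1.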
Project ID: 22640222

About the project

7 proposals
Remote project
Active 4 years ago

7 freelancers are bidding an average of $206 USD for this job
I have read your description and am very interested in your project. I am confident that I can finish it on time. I am an experienced and skilled CUDA/OpenMP/MPI programmer with 5+ years of experience in software development, and I have finished many projects like this one. I will ensure the best quality for your project and keep your deadline. Please contact me so we can discuss it in more detail. Working with me, you will have a good experience, gain a good friend, and save time and money. Best regards!
$120 USD in 3 days
5.0 (89 reviews)
6.2

Hello, I am a CUDA expert with experience in algorithm design. I have developed many algorithms using CUDA, and I would like to implement the histogram algorithm using CUDA. Please contact me to discuss the details and the timeline.
$300 USD in 1 day
5.0 (4 reviews)
5.3

Hi there. I have plenty of experience in C++ and CUDA, and I have also done a similar project. Let's have a chat about the project. I would be glad to work on it.
$180 USD in 1 day
5.0 (12 reviews)
3.8

Hi, I am Goerge. If you ping me, I can give you the result in an hour. Thanks.
$250 USD in 1 day
4.9 (6 reviews)
3.8

No problem! I have read your description carefully and am very interested in your project. I have been working on desktop apps with C/C++, C#, Python, and Java for 7 years. I think I can do it perfectly. If you hire me, you will get great results. I can work full-time in your time zone. Best regards
$140 USD in 7 days
0.0 (0 reviews)
0.0

Hi, I have about seven years of experience in C and CUDA. I have developed similar algorithms in CUDA. I have two GPU cards, a Tesla and a Pascal. I will be able to complete your project as per your requirements and well within time. Thanks, Ajay
$200 USD in 3 days
0.0 (0 reviews)
0.0

About this client

Fairborn, United States
0.0
0
Payment method verified
Member since Apr 25, 2019

Client verification

Other jobs from this client

data structures hash table
$30-250 USD
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)