Increase Performance with Vectorized Memory Access #226

ByLamacq · 2020-03-19T20:48:48Z

Hello,

I changed global memory access from scalar to vector.

Platform: Ubuntu 16.04, GTX 1050 Ti, CUDA 10.1 (top: original)

Platform: Ubuntu 18.04, RTX 2080, CUDA 10.2 (top: original)

Best regards,
ByLamacq

kpot87 · 2020-04-07T01:26:55Z

Have you found anything on the 2080? How much have you checked?

ByLamacq · 2020-04-20T20:13:09Z

Have you found anything on the 2080? How much have you checked?

I'm not searching, so I can't find anything. For me this is just a programming challenge...

hamnaz · 2020-06-21T19:36:37Z

Could you help add these features?
Currently BitCrack runs like this: if the stride is 100,
count + stride + count + stride = 1 + 100 + 1 + 100 = 202 in total.
I'm looking for an update with a new switch, --count, defined by the user (e.g. --count 200), combined with a stride of 100 (the user count is the number of keys checked). For example, with a count of 300:
user-count + stride + user-count + stride + user-count = 300 + 100 + 300 + 100 + 300 = 1100 in total.

As an add-on, if --keyspace is 1:3000, a new switch --loop used with --count 2000 --stride 100:
user-count + stride + user-count + stride + user-count = 2000 + 100 + 2000 (it reaches the end but keeps counting in a loop from 1, the start key) + 100 + 2000, and the loop continues.
--count would be the number of keys to check before each stride.
I hope this feature will make BitCrack more effective and attractive.
Thanks
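
The arithmetic in this request can be sketched in a few lines of C. This is only an illustration of the proposed behaviour as described above, not BitCrack code: `--count` and `--loop` do not exist in BitCrack, the helper names `span_covered` and `run_start` are made up for this example, and 64-bit integers stand in for BitCrack's 256-bit keys.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total keys spanned by `runs` checked runs of `count`
 * keys each plus `gaps` skipped gaps of `stride` keys each. */
static uint64_t span_covered(uint64_t count, uint64_t stride,
                             uint64_t runs, uint64_t gaps)
{
    return runs * count + gaps * stride;
}

/* Hypothetical helper: first key of run number `run` (0-based) in the
 * inclusive keyspace [start, end]. With `loop` non-zero, positions past the
 * end wrap back to `start`. Without it, 0 is returned to mean "past the
 * end", which is why start must be at least 1. */
static uint64_t run_start(uint64_t start, uint64_t end, uint64_t count,
                          uint64_t stride, uint64_t run, int loop)
{
    assert(start >= 1 && end >= start);
    uint64_t size = end - start + 1;          /* number of keys in the keyspace */
    uint64_t offset = run * (count + stride); /* linear distance from start */
    if (offset >= size) {
        if (!loop) {
            return 0;
        }
        offset %= size;                       /* wrap around to the start key */
    }
    return start + offset;
}
```

With the numbers from the request, `span_covered(300, 100, 3, 2)` gives 1100, and with a keyspace of 1:3000, a count of 2000 and a stride of 100, the looping runs start at keys 1, 2101 and 1201.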

marcelosantoto · 2021-02-12T02:52:32Z

Hello, good morning. I would like to know: if I put several video cards in the same computer, say 4 video cards, would the program run with greater power and speed on those 4 cards or not? I await your comments.

marssystems · 2021-02-12T04:56:24Z

Yes - you will have greater power and speed.

marcelosantoto · 2021-02-12T14:45:20Z

Yes - you will have greater power and speed.

First of all, thank you very much for your answer. Another question: can the video cards be any model? For example, 2 GTX 1080 Ti 11GB cards, plus about 3 RX 580 8GB cards, plus an RTX 3060 Ti 8GB. Would that work without problems, or do they all have to be the same model, and all Nvidia or all AMD? I await your answer.

marssystems · 2021-02-12T14:49:46Z

Yes - they can be any Nvidia cards. I don't know about AMD cards.
I use Windows 10 and Nvidia cards with no problems.

marcelosantoto · 2021-02-12T14:57:58Z

Yes - they can be any Nvidia cards. I don't know about AMD cards.
I use Windows 10 and Nvidia cards with no problems.

Again, thank you very much for responding. I will look into adding more video cards to get greater power and speed. May I ask which video cards you use? Have you tried Nvidia GTX, RTX or Quadro cards? Which ones do you recommend?

marssystems · 2021-02-12T15:17:30Z

I use 12 Nvidia P106-100 mining cards and 2 Nvidia Tesla K80's.

marcelosantoto · 2021-02-13T14:28:50Z

I use 12 Nvidia P106-100 mining cards and 2 Nvidia Tesla K80's.

Have you had any luck using so much power and speed with BitCrack?

marssystems · 2021-02-13T14:40:28Z

Not yet - I just started.

marcelosantoto · 2021-02-13T15:05:01Z

I use 12 Nvidia P106-100 mining cards and 2 Nvidia Tesla K80's.

Have you had any luck using so much power and speed with BitCrack?

Are they in 2 separate PCs or in 1, as in a mining rig?

marssystems · 2021-02-13T16:53:08Z

They are on one PC - an old converted mining rig.

Uzlopak · 2021-05-22T10:52:50Z

@ByLamacq Is this a patch that could also be applied to OpenCL? Or is it a CUDA-specific optimization?

BitCrackEvo · 2021-05-23T08:47:54Z

It's not really a patch, but yes, it can be applied to OpenCL. AMD GPUs also have specific microcode for vector data loads, so I think this change in the CL code can increase performance.

Uzlopak · 2021-05-23T10:03:45Z

@BitCrackEvo
My C and C++ skills are limited. Would you be able to implement this?

BitCrackEvo · 2021-05-24T12:20:47Z

@Uzlopak
I will try this later, but there are many changes to make.

Currently, the OpenCL code reads an array of structures:
typedef struct {
    uint v[8];
} uint256_t;

That's not very good, but OpenCL is a high-level programming language, so it depends on how the compiler implements it...

But you can also try to do it yourself... It's good training for improving your skills.
I will try after my own project on BitCrack. Sorry.
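
To make the difference between the two access styles concrete, here is a minimal host-side sketch in plain C. It only illustrates the data layout: the `vec4u32` type is a made-up stand-in for the built-in `uint4` vector type of OpenCL and CUDA, the function names are hypothetical, and none of this is taken from the pull request's actual changes. On a GPU the gain would come from the hardware issuing wider loads, which plain C on a CPU does not demonstrate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout quoted in the thread: eight 32-bit words per 256-bit value. */
typedef struct {
    uint32_t v[8];
} uint256_t;

/* Hypothetical 128-bit vector of four 32-bit lanes (stand-in for uint4). */
typedef struct {
    uint32_t x, y, z, w;
} vec4u32;

/* Scalar style: the 256-bit value is fetched as eight 32-bit reads. */
static uint256_t load_scalar(const uint32_t *words, size_t index)
{
    assert(words != NULL);
    uint256_t r;
    for (size_t i = 0; i < 8; i++) {
        r.v[i] = words[index * 8 + i];
    }
    return r;
}

/* Vector style: the same value is fetched as two 128-bit reads. */
static uint256_t load_vector(const vec4u32 *vecs, size_t index)
{
    assert(vecs != NULL);
    vec4u32 lo = vecs[index * 2];     /* words 0..3 */
    vec4u32 hi = vecs[index * 2 + 1]; /* words 4..7 */
    uint256_t r = {{lo.x, lo.y, lo.z, lo.w, hi.x, hi.y, hi.z, hi.w}};
    return r;
}
```

Both functions return the same `uint256_t` for the same underlying memory. The only difference is how many reads are needed per value: eight narrow ones or two wide ones.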

Uzlopak · 2021-05-24T12:37:05Z

Hi @BitCrackEvo

I started to dig deeper. Very interesting. Can you help me with this question on Stack Overflow?

https://stackoverflow.com/questions/67667314/transform-native-c-matrix-multiplication-to-opencl-simd-matrix-multiplication?r=SearchResults

sigkill · 2022-02-03T13:30:33Z

This made my Jetson Nano about 20% faster.

blm and others added 3 commits March 15, 2020 00:00

Shaping readme.md

b54f606

Change global memory access type to vectorized (from scalar)

9f4ae74

Update for SM61

b5f0fc9

ByLamacq changed the title ~~Increase Performance with Global Vectorized Memory Access~~ Increase Performance with Vectorized Memory Access Mar 19, 2020

drlorente97 approved these changes Feb 20, 2021

ByLamacq and others added 8 commits April 21, 2021 22:51

Update round fonction name to prevent "c" error with cuda library

d44220f

Little update sha256.cuh : unsigned e -> unsigned int e in roundSha256()

78a2dd8

Update to nvidia drivers 460+ and cuda 11 for linux. Many changes, se…

2325855

…e changelog file for more informations.

Merge branch 'master' of https://github.com/ByLamacq/BitCrack

1d29b44

Update visual studio config to cuda 11.3

7f0065a

Small format change : tab to 4 space

a725df8

Small format change : tab to 4 spases

d83f403

Small format change : tab to 4 spaces

4860b10

sigkill mentioned this pull request Feb 3, 2022

Makefile options for NVIDIA Jetson Nano #350

Open
