atom





Administrator Posts: 5,077

Threads: 227

Joined: Apr 2010 #1



We are proud to present oclHashcat-plus v0.09!

Download it here: http://hashcat.net/oclhashcat-plus/



Lots of new features and algorithms have been added, and many bugs have been fixed.



The major changes are:

Support for cracking the bcrypt and sha512crypt ($6$) algorithms.

Support for GPU clustering across multiple LAN hosts via VCL, and an increase to support 128 GPUs.

Added what we call a Brute-Force++ attack (see details for description).

Increased cracking performance, especially on multi-hash, thanks to partial reversing as you know it from single-hash cracking.





Let's start with the algorithms added; in this case, the generic types:

added -m 10 = md5(pass.salt)



added -m 20 = md5(salt.pass)



added -m 30 = md5(unicode(pass).salt)



added -m 40 = md5(salt.unicode(pass))



added -m 110 = sha1(pass.salt)



added -m 120 = sha1(salt.pass)



added -m 130 = sha1(unicode(pass).salt)



added -m 140 = sha1(salt.unicode(pass))



added -m 1410 = sha256(pass.salt)



added -m 1420 = sha256(salt.pass)



added -m 1710 = sha512(pass.salt)



added -m 1720 = sha512(salt.pass)
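For readers new to the notation, the generic modes above can be read as plain hash constructions. The sketch below is illustrative Python, not oclHashcat code. It assumes that `.` means byte concatenation and that `unicode(pass)` means the password encoded as UTF-16LE; the names `GENERIC_MODES` and `generic_hash` are made up for this example.

```python
import hashlib


def utf16le(password: str) -> bytes:
    # Assumption: "unicode(pass)" is the UTF-16LE encoding of the password.
    return password.encode("utf-16-le")


# mode number -> (hash name, function that builds the hashed byte string)
GENERIC_MODES = {
    10: ("md5", lambda p, s: p.encode() + s),
    20: ("md5", lambda p, s: s + p.encode()),
    30: ("md5", lambda p, s: utf16le(p) + s),
    40: ("md5", lambda p, s: s + utf16le(p)),
    110: ("sha1", lambda p, s: p.encode() + s),
    120: ("sha1", lambda p, s: s + p.encode()),
    130: ("sha1", lambda p, s: utf16le(p) + s),
    140: ("sha1", lambda p, s: s + utf16le(p)),
    1410: ("sha256", lambda p, s: p.encode() + s),
    1420: ("sha256", lambda p, s: s + p.encode()),
    1710: ("sha512", lambda p, s: p.encode() + s),
    1720: ("sha512", lambda p, s: s + p.encode()),
}


def generic_hash(mode: int, password: str, salt: bytes) -> str:
    """Return the hex digest a given generic mode would be matched against."""
    algo, build = GENERIC_MODES[mode]
    return hashlib.new(algo, build(password, salt)).hexdigest()


if __name__ == "__main__":
    print(generic_hash(10, "hashcat", b"salt"))   # md5(pass.salt)
    print(generic_hash(140, "hashcat", b"salt"))  # sha1(salt.unicode(pass))
```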



They have been added for two reasons.



1. Because many users requested them, for example here:

http://hashcat.net/forum/thread-1009.html

http://hashcat.net/forum/thread-1152.html

http://hashcat.net/forum/thread-1444.html

http://hashcat.net/forum/thread-474.html

http://hashcat.net/forum/thread-490.html

http://hashcat.net/forum/thread-574.html

http://hashcat.net/forum/thread-577.html

http://hashcat.net/forum/thread-651.html

http://hashcat.net/forum/thread-830.html

http://hashcat.net/forum/thread-833.html

http://hashcat.net/forum/thread-944.html

http://hashcat.net/forum/thread-951.html



2. By adding another feature -- that is, setting the minimum length for a salt to 0 -- you can construct your own hashing modes if you exploit the salt by putting some data into the calculation. Since we have support in oclHashcat-plus for --hex-salt, this will make your lives even easier.
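A small sketch of what "exploiting the salt" can look like, in illustrative Python rather than oclHashcat code. It assumes the hash list uses one `hash:salt` pair per line and that --hex-salt means the salt field is hex-encoded; the helper names and the fixed suffix are invented for the example.

```python
import hashlib


def md5_pass_salt(password: bytes, salt: bytes) -> str:
    """What -m 10 = md5(pass.salt) computes for one candidate."""
    return hashlib.md5(password + salt).hexdigest()


def hashlist_line(digest_hex: str, salt: bytes, hex_salt: bool = False) -> str:
    """Format one hash-list entry (assumed layout: hash:salt)."""
    shown = salt.hex() if hex_salt else salt.decode("ascii")
    return f"{digest_hex}:{shown}"


if __name__ == "__main__":
    # A zero-length salt turns the salted mode into plain md5(pass).
    print(md5_pass_salt(b"hashcat", b""))

    # Hypothetical application that hashes md5(pass + fixed suffix):
    # put the fixed suffix into the salt field and use md5(pass.salt).
    suffix = b"-site-wide-suffix"
    digest = md5_pass_salt(b"hashcat", suffix)
    print(hashlist_line(digest, suffix))

    # Binary data does not fit on a text line, so it is written as hex.
    binary = b"\x00\xff\x10"
    print(hashlist_line(md5_pass_salt(b"hashcat", binary), binary, hex_salt=True))
```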





Next is the bcrypt algorithm.



Guys, there is not much to say. Just one thing: do not expect too much! This algorithm was designed to run extremely slowly on GPUs. It is highly dependent on memory lookups, and is both salted and iterated. On our hd6990, we can reach 4085/s. This isn't much, but it's still multiple times faster than on CPU.



Details here:

http://hashcat.net/forum/thread-1219.html

http://hashcat.net/forum/thread-302.html

http://hashcat.net/forum/thread-186.html





Another algorithm we added was the EPiServer algorithm. These are the hashes stored by the ASP.NET membership provider. For more detailed information about this, have a look here: http://hashcat.net/forum/thread-987.html

There are plans to rename this algorithm from EPiServer to something like "asp.net membership provider." For now we will stick to EPiServer, but we will certainly rename this in a later version.



There was already an interesting blog post about all this here, definitely a good read: http://www.troyhunt.com/2012/06/our-pass...othes.html







Last but not least, the most impressive addition is the sha512crypt algorithm, aka $6$, which is used in nearly all Linux distributions by default.



Like all crypt(3) algorithms, this is another algorithm which is designed to run slowly; plus, it is based on sha512, which uses 64-bit integers. Today's AMD GPUs do not have native support for 64-bit bitwise arithmetic (except shifts), so this is another reason why this algorithm is slow.



Still, the speedup cracking sha512crypt on GPU versus CPU is much higher compared to bcrypt. My hd6990 gives an impressive 32519/s, which we are very proud of!
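To put the two hd6990 figures in perspective, here is a rough back-of-the-envelope calculation in Python. The rates are the ones quoted in this post; the keyspace (six lowercase letters) is just an example picked for illustration, and the calculation assumes a single salt. With several differently salted hashes, the work multiplies accordingly.

```python
def seconds_to_exhaust(charset_size: int, length: int, rate_per_second: float) -> float:
    """Time to try every candidate of a fixed length at a constant rate."""
    return charset_size ** length / rate_per_second


# Rates quoted above for one hd6990, in candidates per second.
RATES = {"bcrypt": 4085, "sha512crypt": 32519}

if __name__ == "__main__":
    for name, rate in RATES.items():
        hours = seconds_to_exhaust(26, 6, rate) / 3600
        print(f"{name}: about {hours:.1f} hours for 6 lowercase letters")
```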



This algorithm was requested here:

http://hashcat.net/forum/thread-790.html

http://hashcat.net/forum/thread-736.html

http://hashcat.net/forum/thread-303.html





The partial reversing of hashes for multi-hash lists differs a bit from classic single-hash reversal, which you are already familiar with if you use oclHashcat-lite. For several reasons, it is not efficient to reverse all hashes as many steps back as in single-hash cracking, and thus we cannot reach oclHashcat-lite speed. But it can still be more efficient than just traditional early checks.



To visualize this, I made some graphs:











You can see that the fewer hashes you have, the more efficient it is. The curves on Nvidia are a bit sharper.



Whenever you run brute force on multiple MD4, NTLM or MD5 hashes, oclHashcat-plus will use this partial reversal technique. In theory we can port this to salted hashes as well, but multi-hash on a salted hash is a bad idea. So for now, we stick to raw and reversible algorithms.
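To give a feeling for what "reversing" means, here is a deliberately tiny Python sketch for single-block MD5. It only undoes the very last operation (adding the initial state to the result), once per target hash, so each candidate can skip that operation and be compared against the reversed targets. This is not the oclHashcat kernel: the real kernels reverse more steps and are far more involved, and all function names here are invented for the example.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
CONSTS = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]


def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK


def pad_single_block(msg: bytes) -> bytes:
    """MD5 padding for messages that fit into one 64-byte block."""
    assert len(msg) < 56
    return msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack("<Q", len(msg) * 8)


def md5_steps(block: bytes) -> tuple:
    """Run the 64 MD5 steps but skip the final addition of the IV."""
    m = struct.unpack("<16I", block)
    a, b, c, d = IV
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + CONSTS[i] + m[g]) & MASK
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
    return a, b, c, d


def reverse_final_add(digest_hex: str) -> tuple:
    """Undo the last operation of MD5 once per target: subtract the IV."""
    words = struct.unpack("<4I", bytes.fromhex(digest_hex))
    return tuple((w - iv) & MASK for w, iv in zip(words, IV))


def search(candidates, digests) -> dict:
    """Compare each candidate's pre-final state against the reversed targets."""
    targets = {reverse_final_add(d): d for d in digests}
    found = {}
    for cand in candidates:
        state = md5_steps(pad_single_block(cand))
        if state in targets:
            found[cand] = targets[state]
    return found


if __name__ == "__main__":
    wanted = [hashlib.md5(b"hashcat").hexdigest(), hashlib.md5(b"abc").hexdigest()]
    print(search([b"a", b"abc", b"hashcat", b"zzz"], wanted))
```

The reversal cost is paid once per hash in the list, while the saving applies to every candidate, which fits the observation above that the technique is most efficient with few hashes.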





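Another nice thing that came up lately is the Virtual OpenCL Cluster Platform (VCL) project. When thorsheim and epixoip informed us about this project in this post

http://hashcat.net/forum/thread-1473.html

it was totally not working with oclHashcat-*, nor any other OpenCL-based password cracker. But we got in contact with the developers at MOSIX, and after some debugging and trace sessions, we were able to pinpoint the problems. MOSIX then released VCL version 1.15, which addressed these issues.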



The overhead produced by the network agents is very low. This is one of the most important factors for a distributed solution. I made some stats on this here:









VCL is intended to be used on dedicated LANs or with High Speed Interconnects. I would not recommend clustering nodes over the Internet, as both latency and bandwidth would be an issue.



Development for VCL support is still in its infancy, but I've tested it with 22 GPUs and it worked well. Installing and configuring VCL is outside the scope of these release notes, but I plan to write a forum post on this topic soon. However, there is no magic required to get VCL running on your own.



To better support VCL, we have increased the maximum number of GPUs from 16 to 128. We do not know for a fact if VCL can handle 128 GPUs, but it works with at least 22 GPUs.



Another nice thing about this is that it works around the 8-GPU limitation in AMD's drivers and Xorg. Since VCL does not require X to run, you can build giant GPU clusters this way.





Something that was already included in the newer versions of oclHashcat-lite is the support for markov-chains.



It does not matter whether you run a simple Brute-Force attack using -a 3 or a dictionary-based Hybrid-Attack using either -a 6 or -a 7. This enhancement is automatically used EVERY time you use a mask.

A little background on this, in case you do not use oclHashcat-lite:



The markov-attack is a statistically based brute-force like attack, but instead of specifying a charset or a mask, we specify a file that was generated once in a previous step. It contains statistical information which is made out of an automated analysis of a given dictionary.



It can fully replace Brute-Force since it covers the full keyspace.



In Brute-Force Attack (or in Mask Attack) we can limit the keyspace by setting a smaller charset in order to reduce the attack time. In Markov Attack we have something similar, the "threshold". All you do is specify a number. The higher the number, the higher the threshold to add a new link between two characters in the two-level table on which the markov-attack is based.



The background is not so important -- just remember that the lower the value, the smaller the keyspace, and thus the faster the attack is.
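For those who prefer code over words, here is a toy Python sketch of the idea: per-position statistics built from a wordlist decide the order in which characters are tried, and a threshold cuts the keyspace. It is only an illustration and not the algorithm or file format oclHashcat uses. In particular, it assumes the threshold is the maximum number of follow-up characters kept per position, and all names are made up.

```python
from collections import Counter, defaultdict


def build_stats(words, length):
    """Count first characters, and per position which character follows which."""
    first = Counter()
    links = [defaultdict(Counter) for _ in range(length)]
    for word in words:
        if not word:
            continue
        first[word[0]] += 1
        for pos in range(1, min(len(word), length)):
            links[pos][word[pos - 1]][word[pos]] += 1
    return first, links


def ordered(counter, charset, threshold):
    """Most frequent characters first, unseen ones last, cut at the threshold."""
    ranked = [c for c, _ in counter.most_common() if c in charset]
    ranked += [c for c in charset if c not in counter]
    return ranked[:threshold]


def candidates(first, links, charset, length, threshold):
    """Yield fixed-length candidates, statistically likely ones first."""
    def walk(prefix):
        if len(prefix) == length:
            yield prefix
            return
        pos = len(prefix)
        counter = first if pos == 0 else links[pos][prefix[-1]]
        for c in ordered(counter, charset, threshold):
            yield from walk(prefix + c)

    yield from walk("")


if __name__ == "__main__":
    stats = build_stats(["pass", "password", "pa55"], length=2)
    # Threshold equal to the charset size still covers the full keyspace.
    print(list(candidates(*stats, charset="aps5", length=2, threshold=4)))
    # A lower threshold means a smaller keyspace and a faster attack.
    print(list(candidates(*stats, charset="aps5", length=2, threshold=1)))
```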



But if you take a close look at it, the technically correct description would be: "Brute-Force attack enhanced by per-position markov-chains built out of wordlists for statistics with the ability to use filters using a mask". OK? That required some special naming, and since it's 100% replacing Brute Force, we made it simple for ourselves and called it Brute-Force++.



Here is a nice chart that visualizes the efficiency of Brute-Force++:







The original description of how this works can be found here:

http://hashcat.net/forum/thread-1291.html

http://hashcat.net/forum/thread-1285.html

http://hashcat.net/forum/thread-1265.html





Use .ptx and .llvmir intermediate kernels - from oclHashcat-lite



The kernels are distributed in an "intermediate" format (aka IL). This format cannot be reversed to its original C code, but is still not a binary format that can be used for execution.



The JIT (just-in-time) compilers from both OpenCL and CUDA, which ship with the driver, compile the final bytecode out of the IL. This takes a few seconds per kernel, but this is a one-time operation as the bytecode is cached (CUDA does it automatically, OpenCL does not, but we added a function that emulates CUDA's behavior).



This has some nice advantages:

Not 32/64 bit specific



Less HDD space



Smaller .7z



Fewer driver-specific problems, as we often see with Catalyst



There is no more need to release a new oclHashcat-* in case a new driver optimization has been added. Cached oclHashcat-* kernels are driver specific. If it recognizes a driver change, it will rebuild the bytecode from the IL, but using the new JIT from the new driver, resulting in driver-specific optimized bytecode.
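The caching idea can be sketched in a few lines of Python. This is not how oclHashcat implements it; it is a minimal illustration that assumes the cache key is derived from the driver version plus the IL contents, with a stand-in `jit_compile` callable instead of a real OpenCL or CUDA compiler.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, il_bytes: bytes, driver_version: str) -> Path:
    """Cache file name depends on both the IL and the driver that compiled it."""
    key = hashlib.sha1(driver_version.encode() + b"\x00" + il_bytes).hexdigest()
    return Path(cache_dir) / f"{key}.bin"


def load_kernel(cache_dir, il_bytes: bytes, driver_version: str, jit_compile) -> bytes:
    """Return cached bytecode, or compile the IL once and store the result."""
    path = cache_path(cache_dir, il_bytes, driver_version)
    if path.exists():
        return path.read_bytes()
    binary = jit_compile(il_bytes)  # slow step, only on a cache miss
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(binary)
    return binary
```

Because the driver version is part of the key, a driver update simply misses the cache and triggers a rebuild with the new JIT.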





Added Retaining GPU temperature - from oclHashcat-lite



When I started with oclHashcat-* hardware management support, some people asked me to add support for fan-speed. For a long time I was not interested in adding fan-speed code to oclHashcat-* since this is the job of the driver or some specialized controlling software.

I did not change my mind completely on this, but still we have added some fan-speed controlling code. The new parameters are:



Code: --gpu-temp-disable Disable temperature and fanspeed readings and triggers

--gpu-temp-abort=NUM Abort session if GPU temperature reaches NUM degrees celsius

--gpu-temp-retain=NUM Try to retain GPU temperature at NUM degrees celsius (AMD only)

So what this does is, if the temperature configured with the new --gpu-temp-retain parameter is reached, it starts to increase the fan-speed by 1 percent each second. That's all. In practice, this enables you to enforce a very specific operating temperature for your GPUs.



Some notes:

With --gpu-temp-disable you can completely disable all the temperature stuff.



--gpu-temp-retain currently only works for AMD.



--gpu-temp-abort parameter is just the renamed version of the old --gpu-watchdog.



Both parameters accept the 0 value which disables only this specific feature. This means you can step back to the old behavior by specifying --gpu-temp-retain 0.



The default for --gpu-temp-abort is still 90c.



The default for --gpu-temp-retain is 80c.
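The behavior described above, as a minimal Python sketch. It models one tick per second and covers only what is stated in this post: raise the fan by 1 percent when the retain temperature is reached, abort at the abort temperature, and treat 0 as "feature disabled". The function name and the capping at 100 percent are assumptions, and reading the temperature or setting the fan on real hardware is left out entirely.

```python
def fan_step(temp_c: int, fan_pct: int, retain_c: int = 80, abort_c: int = 90):
    """One tick of the control loop; returns (new_fan_pct, abort_session)."""
    if abort_c and temp_c >= abort_c:
        return fan_pct, True                 # --gpu-temp-abort reached
    if retain_c and temp_c >= retain_c:
        fan_pct = min(100, fan_pct + 1)      # +1 percent per second
    return fan_pct, False


if __name__ == "__main__":
    fan = 40
    for temp in [78, 80, 82, 83, 81, 79]:    # made-up readings, one per second
        fan, abort = fan_step(temp, fan)
        print(f"{temp}c -> fan {fan}%" + (" ABORT" if abort else ""))
```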





More implemented feature requests from the forum:

http://hashcat.net/forum/thread-1303.html - Increment-mode for Brute Force



http://hashcat.net/forum/thread-1065.html - OpenLDAP SSHA's Dynamic Base64 Parser



http://hashcat.net/forum/thread-1335.html - Implement command line rules for plus



http://hashcat.net/forum/thread-1263.html - Add Charset ?a



http://hashcat.net/forum/thread-1140.html - Hashcat Exit Statuses



http://hashcat.net/forum/thread-1043.html - Next Dictionary In Line



More implemented feature requests from PM / IRC / Email:

Default-mask for -a 3 mode from oclHashcat-lite v0.10



Commandline switch --disable-potfile feature from hashcat v0.40





This new version has been tested by many beta testers on a wide variety of hardware and operating systems.



All new features were available to beta testers for several weeks. All we did for the last few weeks was perform both automated and manual tests of all features and algorithms, until all issues were 100% fixed.



We want to say a special thank-you to the following beta-testers for their massive support during development:

epixoip



blandyuk



forumhero



M@LIK



mastercracker



proinside



This is great proof of how the cracking community is working together, regardless of what team they are on.



Of course we want to say thanks to all the beta testers who helped find bugs and suggested things as well -- Thanks!



--

atom and matrix







Full changelog:



Code: type: feature

file: kernels

desc: added -m 10 = md5(pass.salt)



type: feature

file: kernels

desc: added -m 20 = md5(salt.pass)



type: feature

file: kernels

desc: added -m 30 = md5(unicode(pass).salt)



type: feature

file: kernels

desc: added -m 40 = md5(salt.unicode(pass))



type: feature

file: kernels

desc: added -m 110 = sha1(pass.salt)



type: feature

file: kernels

desc: added -m 120 = sha1(salt.pass)



type: feature

file: kernels

desc: added -m 130 = sha1(unicode(pass).salt)



type: feature

file: kernels

desc: added -m 140 = sha1(salt.unicode(pass))



type: feature

file: kernels

desc: added -m 141 = EPiServer 6.x

cred: thorsheim



type: feature

file: kernels

desc: added -m 1410 = sha256(pass.salt)



type: feature

file: kernels

desc: added -m 1420 = sha256(salt.pass)



type: feature

file: kernels

desc: added -m 1710 = sha512(pass.salt)



type: feature

file: kernels

desc: added -m 1720 = sha512(salt.pass)



type: feature

file: kernels

desc: added -m 1800 = sha512crypt, SHA512(Unix)



type: feature

file: kernels

desc: added -m 3200 = bcrypt



type: feature

file: kernels

desc: removed -a 4 permutation attack (use rules and combinator-attack instead)



type: feature

file: kernels

desc: added reversing kernel for multihash MD5 if running in -a 3 mode and mask < length 9



type: feature

file: kernels

desc: added reversing kernel for multihash MD4 if running in -a 3 mode and mask < length 13



type: feature

file: kernels

desc: added reversing kernel for multihash NTLM if running in -a 3 mode and mask < length 9



type: feature

file: kernels

desc: on AMD, switched from .kernel to .llvmir to reduce diskspace



type: feature

file: kernels

desc: on NV, switched from .cubin to .ptx to reduce diskspace



type: feature

file: kernels

desc: added kernel cache to avoid unnecessary recompilation

cred: m4tr1x



type: feature

file: kernels

desc: brought back support for AMD hd4xxx GPUS due to .llvmir integration



type: feature

file: kernels

desc: optimized 0x80 handling; +3.6% speed in combinator- and hybrid-attack



type: feature

file: host programs

desc: added support for Virtual OpenCL (VCL) Cluster Platform VCL 1.15

cred: epixoip



type: feature

file: host programs

desc: added support for up to 128 GPUS



type: feature

file: host programs

desc: ported markov-attack from oclHashcat-lite v0.10



type: feature

file: host programs

desc: ported increment-mode from oclHashcat-lite v0.10



type: feature

file: host programs

desc: ported default-mask from oclHashcat-lite v0.10



type: feature

file: host programs

desc: ported -j and -k single rules from oclHashcat v0.27



type: feature

file: host programs

desc: allowed zero-length salts in the generic algorithms makes it more easy to exploit them



type: feature

file: host programs

desc: added next-dictionary-in-line feature to skip inefficient dictionaries on keypress



type: feature

file: host programs

desc: implemented base64 parser that would allow for dynamic salt lengths in nsldaps



type: feature

file: host programs

desc: worked around memory allocation limit, you can load twice as much hashes in multihash



type: driver

file: kernels

desc: added support for NVidia CUDA 5.0



type: driver

file: kernels

desc: added support for AMD APP SDK v2.7



type: driver

file: host programs

desc: added support for NVidia NVML library and got rid of nvidia-smi command



type: feature

file: host programs

desc: splitted --gpu-watchdog to --gpu-temp-disable and --gpu-temp-abort



type: feature

file: host programs

desc: added --gpu-temp-retain to try retain temperature at NUM degrees celsius

cred: m4tr1x



type: feature

file: host programs

desc: worked around AMD bug in clGetDeviceInfo() CL_DEVICE_MAX_CLOCK_FREQUENCY

cred: m4tr1x



type: change

file: host program

desc: updated exit status code, see status_codes.txt for details

cred: m4tr1x



type: feature

file: host programs

desc: backported --disable-potfile feature from hashcat v0.41

cred: m4tr1x



type: feature

file: host programs

desc: add ?a to built-in charsets as ?l?u?d?s

cred: m4tr1x



type: feature

file: host programs

desc: added fan-speeds to status display



type: bug

file: host programs

desc: fixed a bug in host program for WPA/WPA2 in -a 1, -a 6 and -a 7 mode

cred: bjorn



type: bug

file: kernels

desc: fixed a bug in kernel for WPA/WPA2 on AMD VLIW architecture leading to code not found

cred: DrGeek



type: change

file: contact.txt

desc: updated contact information (moved to freenode IRC)



M@LIK





Senior Member Posts: 414

Threads: 14

Joined: Mar 2012 #2

Great work! Good job all!



kartan





Team Hashcat Leader Posts: 68

Threads: 3

Joined: Feb 2011 #3

fucking amazing! this would qualify for a major release!

sch0.org



forumhero





Senior Member Posts: 313

Threads: 44

Joined: Aug 2011 #4

fantastic work, everyone!



atom





Administrator Posts: 5,077

Threads: 227

Joined: Apr 2010 #5



As said in the release notes, here is the howto:

Building GPU-Clusters for oclHashcat with VCL v1.15: https://hashcat.net/wiki/doku.php?id=vcl_cluster_howto



mastercracker





Senior Member Posts: 621

Threads: 57

Joined: May 2010 #6

(09-08-2012, 04:15 PM) atom Wrote: As said in the release notes, here is the howto:

Building GPU-Clusters for oclHashcat with VCL v1.15: https://hashcat.net/wiki/doku.php?id=vcl_cluster_howto

Good wiki. It's not mentioned, but I guess that you are bound by the same limitation as the OCL version, which is that you need the same cards on each machine, or at least cards using the same kernel, right?



atom





Administrator Posts: 5,077

Threads: 227

Joined: Apr 2010 #7

Yes, right, while my prio 1 for v0.10 is to enable mixed GPU types.



Mem5





Posting Freak Posts: 763

Threads: 135

Joined: Feb 2011 #8

Thanks! Great release as always!



forumhero





Senior Member Posts: 313

Threads: 44

Joined: Aug 2011 #9

atom, just wanted to clarify. Is the master node required to be on the same high-speed LAN, or can it be on wireless?



atom





Administrator Posts: 5,077

Threads: 227

Joined: Apr 2010 #10



Wireless LAN is a high-speed LAN, somewhat. Should work, yes!