And just like that, Hardware Offload of SSL is working

I bought a Cavium Nitrox PX CN1620 hardware encryption offload board on eBay for about $20. According to Cisco’s published specs, it happens to use the same encryption offload chip as a Cisco ASA 5585-X, but you know those guys, they probably put four of the chips on their board. My card has one chip.

But now, how do I make that work in Linux? I hear it won’t work at all in BSD.

I had to compile the cryptodev-linux kernel module from source and then install it. Then I had to recompile OpenSSL with some new config parameters (the -DHAVE_CRYPTODEV and -DUSE_CRYPTODEV_DIGESTS flags that show up in the compiler line of the second benchmark), and voilà. I say voilà like it was easy, but that took me a full two hours to figure out, which is not too easy for a Linux noob like me.

I’ll make another post with all of the details on getting it working, but for now, here is the fun part – Benchmarks!!

========================== BEFORE =================

black@Synapse:~$ openssl speed -engine padlock -evp aes-256-cbc
engine "padlock" set.
Doing aes-256-cbc for 3s on 16 size blocks: 60320931 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 21110237 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 5714973 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1446415 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 183076 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 91591 aes-256-cbc's in 2.99s
OpenSSL 1.1.0f 25 May 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/lib/ssl\"" -DENGINESDIR="\"/usr/lib/x86_64-linux-gnu/engines-1.1\""
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc    321711.63k   450351.72k   487677.70k   493709.65k   499919.53k   501881.92k
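
As a sanity check, the summary row can be reproduced from the "Doing aes-256-cbc" lines above it: each figure is the block size times the number of operations, divided by the reported seconds and then by 1000. Here is a minimal Python sketch of that arithmetic. The names `BEFORE_RUNS` and `throughput_k` are made up for illustration and are not part of OpenSSL.

```python
# Figures copied from the "Doing aes-256-cbc ..." lines of the BEFORE run:
# (block size in bytes, operations completed, seconds reported by openssl speed)
BEFORE_RUNS = [
    (16, 60320931, 3.00),
    (64, 21110237, 3.00),
    (256, 5714973, 3.00),
    (1024, 1446415, 3.00),
    (8192, 183076, 3.00),
    (16384, 91591, 2.99),
]


def throughput_k(block_size, operations, seconds):
    """Throughput in 1000s of bytes per second, the unit used in the summary row."""
    return block_size * operations / seconds / 1000.0


# Print one line per block size in the same "k" notation as the summary row.
for block_size, operations, seconds in BEFORE_RUNS:
    print(f"{block_size:>6} bytes: {throughput_k(block_size, operations, seconds):.2f}k")
```

The values printed this way should line up with the aes-256-cbc row of the summary table.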

===================== AFTER =======================

black@Synapse:~/openssl-1.1.0f$ openssl speed -engine cryptodev -evp aes-256-cbc
engine "cryptodev" set.
Doing aes-256-cbc for 3s on 16 size blocks: 3951561 aes-256-cbc's in 0.48s
Doing aes-256-cbc for 3s on 64 size blocks: 3707961 aes-256-cbc's in 0.35s
Doing aes-256-cbc for 3s on 256 size blocks: 2466504 aes-256-cbc's in 0.26s
Doing aes-256-cbc for 3s on 1024 size blocks: 1058253 aes-256-cbc's in 0.06s
Doing aes-256-cbc for 3s on 8192 size blocks: 165430 aes-256-cbc's in 0.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 84118 aes-256-cbc's in 0.01s
OpenSSL 1.1.0f 25 May 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSLDIR="\"/usr/lib/ssl\"" -DENGINESDIR="\"/usr/lib/x86_64-linux-gnu/engines-1.1\""
The 'numbers' are in 1000s of bytes per second processed.
type                16 bytes       64 bytes      256 bytes     1024 bytes     8192 bytes    16384 bytes
aes-256-cbc       131718.70k     678027.15k    2428557.78k   18060851.20k  135520256.00k  137818931.20k
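
One caveat on reading these numbers: by default, openssl speed divides by the CPU time the process used, not by wall-clock time (the -elapsed option switches it to wall-clock). That is why the runs above report 0.01s to 0.48s instead of 3s. With an offload engine, most of the waiting happens outside the process, so the divisor shrinks and the per-second figures get inflated. The Python sketch below re-derives the row under the assumption that each run really lasted the nominal 3 seconds of wall-clock time. That is only an assumption, since the output does not record the real elapsed time, and the names `AFTER_RUNS`, `reported_k`, and `wall_clock_estimate_k` are hypothetical.

```python
# Figures copied from the "Doing aes-256-cbc ..." lines of the AFTER run:
# (block size in bytes, operations completed, CPU seconds reported by openssl speed)
AFTER_RUNS = [
    (16, 3951561, 0.48),
    (64, 3707961, 0.35),
    (256, 2466504, 0.26),
    (1024, 1058253, 0.06),
    (8192, 165430, 0.01),
    (16384, 84118, 0.01),
]

# ASSUMPTION: each run lasted the nominal 3 s of wall-clock time ("for 3s").
ASSUMED_WALL_SECONDS = 3.0


def reported_k(block_size, operations, cpu_seconds):
    """Throughput as printed in the summary row: bytes / CPU seconds / 1000."""
    return block_size * operations / cpu_seconds / 1000.0


def wall_clock_estimate_k(block_size, operations, wall_seconds=ASSUMED_WALL_SECONDS):
    """Estimated throughput if the run took wall_seconds of real time."""
    return block_size * operations / wall_seconds / 1000.0


# Show the printed figure next to the wall-clock estimate for each block size.
for block_size, operations, cpu_seconds in AFTER_RUNS:
    shown = reported_k(block_size, operations, cpu_seconds)
    estimate = wall_clock_estimate_k(block_size, operations)
    print(f"{block_size:>6} bytes: reported {shown:.2f}k, "
          f"estimate at {ASSUMED_WALL_SECONDS:.0f}s wall-clock {estimate:.2f}k")
```

Re-running the benchmark as openssl speed -elapsed -engine cryptodev -evp aes-256-cbc would give measured wall-clock figures directly instead of this estimate.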