SAGE & cryptology

From YobiWiki
Revision as of 07:59, 24 December 2009 by Newacct

Discussions

Docs

General Remarks

  • When using Python code in Sage
    • Sage uses an internal integer representation (sage.rings.integer.Integer) which differs from Python's int representation and which can lead to errors in some circumstances.
      => cast Sage integers to Python integers before supplying them to a Python function in Sage
      • Use int(...) to cast a variable (ex. int(var) )
      • Use ...r for an immediate Python integer (ex. 384r)
      • more info here
      • one example: '1/2' in Python evaluates to 0, of type int (for now; true division is foreseen, cf. the __future__ module, and will give 0.5 of type float), but '1/2' in Sage evaluates to 1/2, of type sage.rings.rational.Rational. To get the Python behaviour, either cast the result with int() or use the operator // instead of /.
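
The division pitfall above can be sketched in plain Python 3, so that it runs outside Sage. This is only an illustration under assumptions: fractions.Fraction stands in for sage.rings.rational.Rational, and needs_python_int is a hypothetical helper mimicking library code that rejects anything but a built-in int.

```python
from fractions import Fraction


def needs_python_int(n):
    """Hypothetical helper: mimics library code that insists on a built-in int."""
    if type(n) is not int:
        raise TypeError("expected a Python int, got %s" % type(n).__name__)
    return n * 2


half = Fraction(1, 2)      # stand-in (assumption) for the Sage result of 1/2
floor_half = 1 // 2        # floor division always yields an int
cast_half = int(half)      # explicit cast truncates toward zero

print(floor_half, cast_half)

# Casting first makes the value acceptable to the strict function.
print(needs_python_int(int(Fraction(384))))
```

Inside Sage the same idea is written as int(var) for a variable or 384r for a literal, as listed above.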

TODO

  • check licensing of different packages and compare them to requirements for sage
    • PyCrypto:

I've filed a bug in PyCrypto's bug tracker: https://bugs.launchpad.net/pycrypto/+bug/260130

The PyCrypto licensing status is a bit of a mess. It looks like a bunch of reference implementations were simply copied-and-pasted into the source tree, and each has its own licensing statement. I recommend looking at each source file and making a judgment for yourself. I'm slowly working on a new release of PyCrypto (I've just taken over from Andrew Kuchling). In the next release, I'll try to document things better, and fix the most obvious problems (I've already written a replacement for RIPEMD.c). However, some of the software is unattributed. I assume that most of it was written by A.M. Kuchling, but I can't be totally sure. I'll try to contact Andrew and see if he can clear things up. - Dwayne

  • analyse structure of integers and strings and see if the representation from sage is compatible with the one from python
    • Integers:
      Integers in Sage are a different type than the Python integers. Problems can occur when executing standard python code in Sage. To avoid problems: add 'r' after a number to let it be interpreted as a Python integer.
      More info:
      Google Groups: [1] [2]
      Sage Tutorial on differences caused by the Sage preparser: [3]
    • Strings:
      Strings in Sage are the same type as Python strings
  • analyse the format of objects used by different libraries and see if they are compatible
    • Almost all libraries use "binary strings" as input/output, with some exceptions:
      • PyCrypto: RSA signatures are "long"
      • TLS Lite: RSA signatures are "array of bytes" (defined by the TLS Lite library)
  • write a unified API for the different libraries
  • write wrapper for internal C library
  • check keyczar, the new lib of Google, also available in Python

TODO for Phil

  • run some speed tests with psyco, which claims to run Python code up to 5x faster, but:
    Psyco does not support the 64-bit x86 architecture, unless you have a Python compiled in 32-bit compatibility mode. There are no plans to port Psyco to 64-bit architectures. This would be rather involved. Psyco is only being maintained, not further developed. The development efforts of the author are now focused on PyPy, which includes Psyco-like techniques.
    • so to be tried on the desktop and if efficient, for the IT box
    • see also doc of Sage
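    • here is an example: I tried it on the fastest long2string() implementation, and it helps even on such a small piece of code:
def long2string(i):
    # hex(i) looks like '0x...L' for a Python 2 long: drop the '0x' prefix and the 'L' suffix
    s='0'+hex(i)[2:-1]
    # drop the leading '0' again unless it is needed to make the digit count even
    return s[len(s) % 2:].decode('hex')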
timeit long2string(123456789012345678901234)
#100000 loops, best of 3: 2.7 µs per loop
import psyco
psyco.full()
timeit long2string(123456789012345678901234)
#1000000 loops, best of 3: 1.44 µs per loop
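
For comparison, a possible Python 3 counterpart of long2string() is sketched below. It is not part of the measurements above; the names long2bytes and bytes2long are made up for this sketch. It relies on int.to_bytes, so a Sage Integer would have to be cast with int() first.

```python
def long2bytes(i):
    """Big-endian bytes of a non-negative integer (zero gives one zero byte)."""
    length = max(1, (i.bit_length() + 7) // 8)
    return i.to_bytes(length, "big")


def bytes2long(b):
    """Inverse conversion: big-endian bytes back to an integer."""
    return int.from_bytes(b, "big")


n = 123456789012345678901234
roundtrip_ok = bytes2long(long2bytes(n)) == n
print(roundtrip_ok)
print(long2bytes(0x0102).hex())
```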

Set up a Subversion server and explain how to use it for new Python code development.

Available

sage.crypto

Sage Reference Manual
Constructions in Sage

  • sage.crypto.all
  • sage.crypto.cipher
  • sage.crypto.classical
  • sage.crypto.classical_cipher
    • hillcipher
    • substitutioncipher
    • transpositioncipher
    • vigenerecipher
  • sage.crypto.cryptosystem
  • sage.crypto.lfsr
    • Module Level Functions
      • lfsr_autocorrelation(L, p, k)
      • lfsr_connection_polynomial(s)
      • lfsr_sequence(key, fill, n)
    • Examples:
sage: F = GF(2)
sage: o = F(0)
sage: l = F(1)
sage: key = [l,o,o,l]; fill = [l,l,o,l]; n = 20
sage: s = lfsr_sequence(key,fill,n)
sage: lfsr_autocorrelation(s,15,7)
4/15
sage: lfsr_autocorrelation(s,int(15),7)
4/15
sage: F = GF(2)
sage: F
Finite Field of size 2
sage: o = F(0); l = F(1)
sage: key = [l,o,o,l]; fill = [l,l,o,l]; n = 20
sage: s = lfsr_sequence(key,fill,n); s
[1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0]
sage: lfsr_connection_polynomial(s)
x^4 + x + 1
sage: berlekamp_massey(s)
x^4 + x^3 + 1
  • sage.crypto.stream_cipher [4]
    • Class: LFSRCipher
    • Class: ShrinkingGeneratorCipher
      • new(input): input = connection polynomial & initial state
      • decimating_cipher(self)
sage: FF = FiniteField(2)
sage: P.<x> = PolynomialRing(FF)
sage: LFSR = LFSRCryptosystem(FF)
sage: IS_1 = [ FF(a) for a in [0,1,0,1,0,0,0] ]
sage: e1 = LFSR((x^7 + x + 1,IS_1))
sage: IS_2 = [ FF(a) for a in [0,0,1,0,0,0,1,0,1] ]
sage: e2 = LFSR((x^9 + x^3 + 1,IS_2))
sage: E = ShrinkingGeneratorCryptosystem()
sage: e = E((e1,e2))
sage: e.decimating_cipher()
(x^9 + x^3 + 1, [0, 0, 1, 0, 0, 0, 1, 0, 1])
      • keystream_cipher(self)
...
sage: e.keystream_cipher()
(x^7 + x + 1, [0, 1, 0, 1, 0, 0, 0])
  • sage.crypto.mq
    • SBox Class: sage.crypto.mq.SBOX: see wiki.sagemath.org
      • S.difference_distribution_matrix()
      • S.maximal_difference_probability()
      • S.interpolation_polynomial()
      • S.polynomials(degree=2)
    • Small Scale Variants of the AES (SR) Polynomial System Generator: sage.crypto.mq.sr Reference Manual
    • Multivariate Polynomial Systems: sage.crypto.mq.mpolynomialsystem [5]
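
To make the lfsr_sequence(key, fill, n) example above easier to follow without Sage, here is a minimal pure-Python sketch over GF(2). The function name lfsr_sequence_gf2 is made up for this sketch and the recurrence is inferred from the sample session, not taken from the Sage source: each new term is the sum mod 2 of key[i] times the i-th of the last k terms.

```python
def lfsr_sequence_gf2(key, fill, n):
    """Return the first n terms of s[j] = sum(key[i] * s[j-k+i]) mod 2."""
    k = len(fill)
    seq = list(fill)
    while len(seq) < n:
        window = seq[-k:]  # current register contents, oldest term first
        seq.append(sum(c * s for c, s in zip(key, window)) % 2)
    return seq[:n]


key = [1, 0, 0, 1]
fill = [1, 1, 0, 1]
s = lfsr_sequence_gf2(key, fill, 20)
print(s)
```

With the key and fill from the session above, this reproduces the 20-term sequence shown there, which repeats with period 15.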

PyCrypto, the Python Cryptography Toolkit

Sage ships PyCrypto (new maintainer's page?) which implements many standard cryptographic algorithms.
It is meant for production code rather than for research, education or playing around, but maybe something could be done to make it easier to access from within Sage.
The docstring level documentation is horrible:

sage: import Crypto.Cipher.IDEA
sage: Crypto.Cipher.IDEA?
   x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Manual is available here.
-> Also contains some info on how to extend the toolkit with new algorithms[6]
A blog about PyCrypto here

PyCrypto mostly consists of C code with a Python wrapper.

INPUT/OUTPUT:

  • always "binary strings" where each character represents 8 bits
  • except for RSA signatures -> those are "long"s
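
Why a raw RSA signature comes out as a "long" can be seen from a toy example. The sketch below does not call PyCrypto; it only performs the modular exponentiation that an unpadded signature amounts to. The key (n = 61 * 53) and the function names raw_sign and raw_verify are assumptions for illustration, and the key is far too small for real use.

```python
n, e, d = 3233, 17, 2753  # toy key: n = 61 * 53, e * d = 1 mod 780


def raw_sign(m, d, n):
    """Unpadded ("textbook") RSA signature: just m^d mod n, an integer."""
    return pow(m, d, n)


def raw_verify(m, sig, e, n):
    """Check that sig^e mod n gives back the message representative."""
    return pow(sig, e, n) == m % n


m = 65
sig = raw_sign(m, d, n)
print(type(sig).__name__, raw_verify(m, sig, e, n))

# Converting the integer to a fixed-length "binary string" is a separate step.
sig_bytes = sig.to_bytes((n.bit_length() + 7) // 8, "big")
print(len(sig_bytes))
```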


  • Hash functions(Manual): MD2, MD4, MD5, RIPEMD, SHA256, SHA, HMAC
    • SHA1 example: on Sage Notebook
    • HMAC example: on Sage Notebook
    • All hash functions support the API described by PEP 247: after importing a given hashing module, call the new() function to create a new hashing object. You can now feed arbitrary strings into the object with the update() method, and can ask for the hash value at any time by calling the digest() or hexdigest() methods. The new() function can also be passed an optional string parameter that will be immediately hashed into the object's state.[7]
      Using digest_size you can get the digest size, but it is constant.
    • MD5, SHA and HMAC are just the standard Python implementations
    • MD2, MD4, SHA256 and RIPEMD-160 are C implementations wrapped by PyCrypto
  • Block encryption algorithms(Manual): AES, ARC2, Blowfish, CAST, DES, Triple-DES, IDEA, RC5, RC2
    • AES & DES example: on Sage Notebook
    • All block ciphers support the interface described in PEP 272
    • Chaining modes: ECB, CBC, CFB, PGP, OFB and CTR
    • Usage: after importing the module, create a new cipher object, specifying the key, the mode (cbc, cfb, ecb or pgp) and, if needed, an IV.
      The object gives you two methods: 'encrypt()' and 'decrypt()'.
      For AES: S-Box not modifiable, LookUp Tables are being used.
  • Stream encryption algorithms: ARC4, simple XOR
  • Public-key algorithms(Manual): RSA, DSA, ElGamal, qNEW
    • RSA example: on Sage Notebook
    • Signature is Long instead of Binary String
      binascii doesn't provide long<->binary conversion[8]
      Encrypted message is a "binary string"
    • No PKCS#1 padding
      sign() in RSA.py calls decrypt() and that only does "return pow(ciphertext[0], self.d, self.n)"
      => no padding <-> TLS Lite has padding
    • n, e and d are also provided as Long
    • Public key Modules
      construct(tuple): construct( (long(n),long(e),long(d)) )
      generate(size, randfunc, progress_func=None) => public key object
    • Public key Objects
      available methods: canencrypt(), cansign(), decrypt(tuple), encrypt(string, K), hasprivate(), publickey(), sign(string, K), size(), verify(string, signature)
  • Protocols: All-or-nothing transforms, chaffing/winnowing
  • Miscellaneous:
    • Crypto.Util.number
      GCD(x,y),getPrime(N, randfunc),getRandomNumber(N, randfunc),inverse(u, v),isPrime(N)
    • Crypto.Util.randpool[9]
      The randpool module implements a strong random number generator in the RandomPool class. The internal state consists of a string of random data, which is returned as callers request it. The class keeps track of the number of bits of entropy left, and provides a function to add new random data; this data can be obtained in various ways, such as by using the variance in a user's keystroke timings.
      • Getting N random bytes:
sage: from Crypto.Util import randpool
sage: randfunc = randpool.RandomPool()
sage: randfunc.get_bytes(N)
-> returns 8-bit string consisting of N bytes
  • Some demo programs (currently all quite old and outdated)
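
The PEP 247 style of hashing described in the list above (create an object, feed it with update(), read digest() or hexdigest()) can be tried with the standard library alone. The sketch below uses hashlib and hmac as stand-ins, not PyCrypto's own modules; that the calls map one-to-one onto PyCrypto is an assumption, and the key and messages are arbitrary.

```python
import hashlib
import hmac

# Incremental hashing: feed the data in pieces.
h = hashlib.new("sha1")
h.update(b"a")
h.update(b"bc")

# One-shot hashing: initial data passed at creation time.
one_shot = hashlib.new("sha1", b"abc")

print(h.hexdigest())
print(h.hexdigest() == one_shot.hexdigest())
print(h.digest_size)  # size in bytes, fixed per algorithm

# HMAC built on top of any such hash; key and message are arbitrary here.
mac = hmac.new(b"key", b"message", hashlib.sha1)
print(len(mac.digest()))
```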

OpenSSL

Manual (incomplete, for example: no AES documentation)
OpenSSL book on Google Books
Functionality in OpenSSL:

  • SYMMETRIC CIPHERS
blowfish(3), cast(3), des(3), idea(3), rc2(3), rc4(3), rc5(3)
Block Cipher Modes available: ECB, CBC, CFB, OFB
Padding: only PKCS7 padding available?
AES is the same implementation as in PyCrypto and only supports ECB and CBC (p. 175 in OpenSSL book)
  • PUBLIC KEY CRYPTOGRAPHY AND KEY AGREEMENT
dsa(3), dh(3), rsa(3)
  • CERTIFICATES
x509(3), x509v3(3)
  • AUTHENTICATION CODES, HASH FUNCTIONS
hmac(3), md2(3), md4(3), md5(3), mdc2(3), ripemd(3), sha(3)
CBC-MAC and XCBC-MAC algorithms for OpenSSL are provided here.
  • AUXILIARY FUNCTIONS
err(3), threads(3), rand(3), OPENSSL_VERSION_NUMBER(3)
  • INPUT/OUTPUT, DATA ENCODING
asn1(3), bio(3), evp(3), pem(3), pkcs7(3), pkcs12(3)
  • INTERNAL FUNCTIONS
bn(3), buffer(3), lhash(3), objects(3), stack(3), txt_db(3)


Functionality of OpenSSL in Sage is provided via the PyOpenSSL wrapper. A more complete wrapper is M2Crypto, but it is not available as a package for Sage. We still have to try importing it.

PyOpenSSL

http://pyopenssl.sourceforge.net/pyOpenSSL.html/

  • X509 objects
  • X509Name objects
  • X509Req objects
  • X509Store objects
  • PKey objects
  • PKCS7 objects
  • PKCS12 objects
  • X509Extension objects
  • NetscapeSPKI objects


Looks like less functionality than PyCrypto => PyCrypto seems like a better candidate to adjust, else we would have to extend the PyOpenSSL wrapper AND OpenSSL itself for any wanted extended functionality.

M2Crypto

Homepage
API documentation
Oreilly: OpenSSL p. 258-266
INPUT/OUTPUT:

  • always "binary strings" where each character represents 8 bits


  • symmetric ciphers (in EVP module: AES, Blowfish, CAST5, DES, DESX, 3DES, IDEA, RC2, RC4, RC5)
    • AES example: on Sage Notebook
    • RC4 example: on Sage Notebook
      • other algo's that can be used:
        aes_128_x, aes_192_x, aes_256_x, bf_x, cast_x, des_x, desx_cbc, des_ede3(_x), des(_ede_x), idea_x, rc2_x, rc4(_40), rc5_32_16_12_x
        Where x is the chaining mode and (...) is optional
    • EVP module (from M2Crypto import EVP)= message digests, symmetric ciphers and PK algo's
    • PKCS7 padding is used
    • A Cipher and Message Digest example:
      see here
  • message digests (MD5, SHA1, RipeMD-160) (in EVP module)
  • HMAC
    • example: on Sage Notebook
    • 2 possibilities: using the HMAC class or the hmac() function
    • supports following hash functions: MD2/4/5,MDC2,SHA1,RipeMD-160
      TLS Lite allows any hash function conforming to PEP 247 to be used
    • the API does not conform to the PEP 247 specification
      The TLS Lite implementation conforms to PEP 247
  • RSA, DSA, DH (in EVP module)
  • PKCS7 padding
  • SSL functionality to implement clients and servers.
  • HTTPS extensions to Python's httplib, urllib, and xmlrpclib.
  • Unforgeable HMAC'ing AuthCookies for web session management.
  • FTP/TLS client and server.
  • S/MIME.
  • ZServerSSL: A HTTPS server for Zope.
  • ZSmime: An S/MIME messenger for Zope.


More functionality than the PyOpenSSL wrapper, but not available as a Sage package. Importing it in Sage is easy.

Setup and import
sudo aptitude install openssl libssl-dev python-dev

$ python setup.py build
$ python setup.py install

sage: import sys                                        
sage: sys.path.append('/usr/lib/python2.5/site-packages')
sage: import M2Crypto

or

$ sage -python setup.py install

GnuTLS

Manual
Standard package in sage.
Mostly for certification and not for basic cryptography.

  • Support for TLS 1.1, TLS 1.0 and SSL 3.0 protocols
Since SSL 2.0 is insecure it is not supported.
TLS 1.2 is supported but disabled by default.
  • Support for TLS extensions: server name indication, max record size, opaque PRF input, etc.
  • Support for authentication using the SRP protocol.
  • Support for authentication using both X.509 certificates and OpenPGP keys.
  • Support for TLS Pre-Shared-Keys (PSK) extension.
  • Support for Inner Application (TLS/IA) extension.
  • Support for X.509 and OpenPGP certificate handling.
  • Support for X.509 Proxy Certificates (RFC 3820).
  • Supports all the strong encryption algorithms (including SHA-256/384/512), including Camellia (RFC 4132).
  • Supports compression.

Python-GnuTLS

http://pypi.python.org/pypi/python-gnutls/1.1.5
API reference built on local machine.
Same story as with OpenSSL: C-library + python wrapper

TLS Lite

Homepage
Mailing List Archive
TLS Lite hasn't been updated since February 21, 2005

INPUT/OUTPUT:

  • always "binary strings" where each character represents 8 bits
  • exception for RSA: "array of bytes" instead of "binary strings"
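
The difference between a "binary string" and an "array of bytes" can be illustrated with the standard array module. This sketch assumes that TLS Lite's byte arrays behave like array('B'); the helper names string_to_bytes and bytes_to_string are made up here and only mirror the idea of the library's own converters.

```python
from array import array


def string_to_bytes(s):
    """Binary string -> array of unsigned bytes (typecode 'B')."""
    return array("B", s)


def bytes_to_string(a):
    """Array of unsigned bytes -> binary string."""
    return a.tobytes()


data = b"\x00\x01\xfe\xff"
as_array = string_to_bytes(data)
print(list(as_array))
print(bytes_to_string(as_array) == data)
```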


Not available as Sage package, but it is pure python

"TLS Lite is a free python library that implements SSL 3.0, TLS 1.0, and TLS 1.1. TLS Lite supports non-traditional authentication methods such as SRP, shared keys, and cryptoIDs in addition to X.509 certificates. TLS Lite is pure Python, however it can access OpenSSL, cryptlib, pycrypto, and GMPY for faster crypto operations. TLS Lite integrates with httplib, xmlrpclib, poplib, imaplib, smtplib, SocketServer, asyncore, and Twisted."
Pure python implementations for:

  • AES:
    • example: on Sage Notebook
    • Interesting are AES.py, Python_AES.py and rijndael.py
      • rijndael.py originally came from here and was ported from this Java code
      • Might be possible to modify SBox but only CBC-mode available (see Python_AES.py)
    • No padding
  • RC4
  • RSA
    • example: on Sage Notebook
    • signature = PKCS1-SHA1
    • input/output is array of bytes instead of binary string
      Use tlslite.utils.keyfactory.stringToBytes and tlslite.utils.keyfactory.bytesToString to convert between array of bytes and binary string
  • TripleDES
    • no pure python implementation. API available for cryptlib, openssl(m2crypto) and pycrypto
  • HMAC
    • example: on Sage Notebook
    • supports the API for Cryptographic Hash Functions (PEP 247)
    • can use any hashing algorithm that also conforms to the PEP 247 specification
    • source has some comments: tlslite/utils/hmac.py
  • tlslite.utils.cryptomath. ... interesting?
    base64ToBytes base64ToNumber base64ToString bytesToBase64 bytesToNumber gcd getBase64Nonce getRandomBytes getRandomNumber getRandomPrime getRandomSafePrime hashAndBase64 invMod isPrime lcm makeSieve mpiToNumber numberToBase64 numberToBytes numberToMPI numberToString numBytes powMod stringToBase64 stringToNumber

Remarks

Bug

Calling the key generation function "rsa = tlslite.utils.keyfactory.generateRSAKey(384,["python"])" throws an error. The TLS Lite code will also break in future versions of Python (more info on the SourceForge link).
Solutions:

  • add an "r" to the number to cast it to a python Integer instead of a Sage integer.
    => rsa = tlslite.utils.keyfactory.generateRSAKey(384r,["python"])
  • fix the code of TLS Lite:
    see the bug report on SourceForge
Importing in Sage
sage: import sys                                        
sage: sys.path.append('/usr/lib/python2.5/site-packages')
sage: import tlslite
sage: from tlslite.api import *

Python

  • conversion between "binary string" and "hexadecimal string"
    • convert hexadecimal to appropriate string input via:
      sage: "A0B1C2".decode('hex')
    • convert string output to hexadecimal via:
      sage: "\xA0\xB1\xC2".encode('hex')
  • modules: hmac, md5, sha, hashlib[10] (contains: md5(), sha1(), sha224(), sha256(), sha384(), and sha512())
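
The .decode('hex') and .encode('hex') string codecs shown above exist only in Python 2. As a sketch of the same conversions in Python 3, using only the standard library:

```python
import binascii

# hexadecimal string -> binary string
raw = bytes.fromhex("A0B1C2")
print(raw)

# binary string -> hexadecimal string (lowercase)
print(raw.hex())

# binascii offers the same conversions on bytes objects
hex_again = binascii.hexlify(raw)
raw_again = binascii.unhexlify(hex_again)
print(raw_again == raw)
```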

Misc

  • book written on Cryptography by David Kohel, using SAGE
  • pyDes: python implementation of DES and 3DES
  • Cryptool: open source windows program for educational use of cryptography
  • LibtomCrypt: C library with lots of stuff with good documentation
  • CryptoPy: has some code to analyze SBox
    AES Sbox Analysis - a simple analysis of the AES Sbox that determines the number and size of the permutation subgroups in the transformation. Could be extended to examine any Sbox ...
  • RSA implementation in Python
  • PyRijndael
    • based on Phil Fresle's VB implementation [11] (doesn't provide more comments)
    • two high level functions:
      • EncryptData(key, data): Encrypts data using key and returns encrypted string. Uses 256 bit Rijndael cipher. Key is built from first 32 characters of password. A 32-bit message length is attached to beginning of ciphertext.
      • DecryptData(key, data)
    • example in the code:
   PlainText="Hello World" *50
   Key="Secret"
   CipherText=EncryptData(Key,PlainText)
   PlainText2=DecryptData(Key,CipherText)
   print "PT :",PlainText
   print "KY :",Key
   print "PT2:",PlainText2
  • Collection of Python Crypto stuff: here
    • PBKDF2 pure python implementation + example: here
    • Public Key algo's in pure python: here
  • in Python:
    • Python Enhancement Proposals
      • PEP 247: API for Cryptographic Hash Functions
      • PEP 272: API for Block Encryption Algorithms v1.0
      • PEP 358: The "bytes" Object (not implemented yet)
      • PEP 3137: Immutable Bytes and Mutable Buffer (not implemented yet)
    • Random generators
      • module random (present in Sage)
      • module _random (Mersenne Twister) (present in Sage)
      • os.urandom(n) (present in Sage)
      • numpy.random core random tools of NumPy (present in Sage) (also via scipy.random)
    • Resources for Python modules related to crypto:
    • Python Coding Project Ideas: M2Crypto & TLS Lite are mentioned there
  • Blockcipher API for ECB, CBC, CFB & OFB: [12]

To be looked at:

  • Python code for ripemd (PEP 247), rijndael, serpent, twofish, whirlpool (PEP 247), XTS
    • code: here
    • blog explaining TrueCrypt using that code: here
  • Python-mcrypt
    python-mcrypt is a comprehensive Python interface to the mcrypt library, which is a library providing a uniform interface to several symmetric encryption algorithms. It is intended to have a simple interface to access encryption algorithms in ofb, cbc, cfb, ecb and stream modes. The algorithms it supports are DES, 3DES, RIJNDAEL, Twofish, IDEA, GOST, CAST-256, ARCFOUR, SERPENT, SAFER+, and more.
  • on Del.icio.us

Wishes

  • General trac
  • sage.crypto: block ciphers
  • Someone needs to replace FiniteField_ext_pari with the two NTL implementations (they are much faster).
  • elliptic and hyperelliptic curves over finite fields support is rather poor
  • algebraic aspects received some attention for the cryptanalysis of symmetric cryptographic algorithms, i.e. the cryptanalyst expresses the cipher as a large set of multivariate polynomials and attempts to solve the system. The most common case over GF(2) is handled by PolyBoRi. This library is the backbone of BooleanPolynomialRing and friends. This class needs testing, documentation, extension and bugfixes. Basically someone should sit down and add all the methods of MPolynomial[Ring]_libsingular to BooleanPolynomial[Ring] which make sense, add a ton of doctests and test the hell out of the library to make sure no SIGSEGVs surprise the user.
  • the module sage.crypto.mq is also relevant for the above.
  • Univariate polynomials over GF(2) are still implemented via NTL's ZZ_pX class rather than GF2X. This should be changed. This ticket also has a link to gf2x, a very small drop-in replacement C library which is claimed to be 5x faster than NTL. A formal vote is needed to get it into Sage, though.
  • At the end of the day everything boils down to linear algebra. So if you improve that, everybody wins. Sparse linear algebra mod p is still too slow (Ralf-Phillip Weinmann did some work here wrapping code from eclib), there is no special implementation for sparse linear algebra over GF(2) (both blackbox and e.g. reduced echelon forms), dense LA over GF(2) needs Strassen multiplication/reduction, and dense LA over GF(2^n) should probably get implemented.

The ideal toolbox

Cross Reference Table: wishes <-> availability

Sage Sandbox

PyCryptoPlus