Whilst working on my previous post on two factor authentication I was reminded of the broad spectrum of approaches taken to security by many sites. I have one bank that does two factor authentication, another using the standard username/password combination for one factor authentication, and then another that asks for a username, a piece of personal information and then a partial password. In this case, three characters from a password.
Of these three approaches it is the last that seems the weakest. The piece of personal information includes things like the town you grew up in, the name of your first school, etc. These are not generally secret and could be discovered for most people by a bit of research. The partial password scheme seems only slightly stronger than a three character password.
I believe the partial password is intended to prevent a keylogger on your computer from compromising your entire password. This scheme is often paired with selecting characters from a dropdown menu, potentially providing additional protection. The idea is that by requesting different characters on each visit you would need to log in multiple times on a compromised computer before an attacker discovered your entire password.
I don't think such a system would be used in a new application today but I did wonder how such a scheme might be implemented.
There are several different ways we can potentially implement a partial password strategy. So that we can give some numbers lets assume during login a user is required to give three characters from an eight character password containing just lowercase letters. From worst to best they are:
The simplest option we have is storing the password as plaintext. The disadvantage is if we are ever the victim of a security breach the attacker will have a complete list of our users passwords.
Unfortunately, I suspect this is the approach that is taken on many sites using this password scheme.
Salted hash of each character
Security can be improved marginally by storing a salted hash of each character in the password. A attacker would not be able to get the plaintext passwords directly but with only 26 options for each character they could be very quickly recovered. There are only 208 (8*26) options to try in total.
Salted hash of every combination
The number of options an attacker would need to try can be increased by moving from single characters to multiple characters. By storing a salted hash for each combination of characters an attacker would need to simultaneously correctly guess three characters. This gives 17576 (262626) different options, requiring over 84 times more work by an attacker. Unfortunately, after the first three characters are known an attacker only needs 26 guesses per character for a total of 17706.
This is probably the best security we can achieve with this scheme but we need to store lots of hashes. For our three characters from an eight character password there are 56 different combinations. If using bcrypt each salted hash requires 60 bytes and if we are very efficient we can store the character positions in a single byte. Overall we therefore need (60 + 1) * 56 = 3416 bytes. Unless you are very resource constrained this is unlikely to be a problem but neither is it very efficient.
Our final option is to use a shared secret approach. This strategy has the same level of security as the salted and hashed character combinations approach but requires a lot less storage space. In this strategy the characters of the password are used as keys to encrypt parts of a shared secret. The implementation below requires:
- For each character
- A salt (16 bytes)
- A work factor (1 byte)
- A nonce / IV used to encrypt the part of the secret (16 bytes)
- An encrypted part of the secret (16 bytes)
- A salt and hash value for the reconstructed secret (60 bytes)
In total this is 452 bytes, under an eighth of the previous approach.
At the heart of this approach is a secret sharing algorithm developed by Adi Shamir. A secret number is generated and the information needed to recover the secret is split into parts. Crucially, the number of parts needed to recover the secret can be specified and it can be less than the number of parts generated.
The wikipedia article linked above explains the algorithm and even includes an example written in python. Instead of repeating the explanation from there I will instead focus on how to connect the algorithm to a password.
Before jumping into the code I want to reiterate that I don't think this partial password strategy, regardless of the implementation, is very good. I wouldn't implement this in an actual project. If for whatever reason you do need to use this strategy you should test this implementation thoroughly.
Key derivation function
The first step is to convert the single character to a key suitable for encryption. To slow an attacker from brute forcing the password this process should be as computationally intensive as possible and key derivation functions have been developed specifically for this process. For encryption the key has to be 32 bytes.
import bcrypt import os password = "a" salt = os.urandom(16) key = bcrypt.kdf(password=password.encode('utf-8'), salt=salt, desired_key_bytes=32, rounds=50) print(key)
This code will produce something similar to the output below.
Getting encryption right can be difficult. Fortunately, the cryptography package for python has emerged as a relatively safe option and includes several recipes for common tasks. For symmetric encryption the package suggests using Fernet. It encourages using a cryptographically secure random number generator for the key, and neatly packages a good algorithm and block mode making it very difficult for any attacker to read the message being encrypted. A message hash is also made to ensure that the message is authentic and has not been modified.
In the majority of situations these are all good features but an authenticated message will actually compromise security for our task.
import base64 import bcrypt from cryptography.fernet import Fernet import os password = "a" salt = os.urandom(16) key = bcrypt.kdf(password=password.encode('utf-8'), salt=salt, desired_key_bytes=32, rounds=50) key = base64.urlsafe_b64encode(key) f = Fernet(key) token = f.encrypt(b"my deep dark secret") print("Encrypted message:", token) print("Decrypted message:", f.decrypt(token))
The code above will produce something similar to the output below
Encrypted message: b'gAAAAABa2l8hJlofKE1zXVace4NdAzk4G-LCpNuc3Jt6_DidVClWFf6FEWh-WWLGnVjywWgkENUxKzfHRq6Y6_Vc3Co1A0T1vn5ew6wQM4sjGg4AYf-wEQw=' Decrypted message: b'my deep dark secret'
If the correct character is given this works well. Now, with the incorrect character:
password = "b" key = bcrypt.kdf(password=password.encode('utf-8'), salt=salt, desired_key_bytes=32, rounds=50) key = base64.urlsafe_b64encode(key) f = Fernet(key) print("Decrypted message:", f.decrypt(token))
--------------------------------------------------------------------------- InvalidSignature Traceback (most recent call last) ~/miniconda3/lib/python3.6/site-packages/cryptography/fernet.py in decrypt(self, token, ttl) 100 try: --> 101 h.verify(data[-32:]) 102 except InvalidSignature: ~/miniconda3/lib/python3.6/site-packages/cryptography/hazmat/primitives/hmac.py in verify(self, signature) 68 ctx, self._ctx = self._ctx, None ---> 69 ctx.verify(signature) ~/miniconda3/lib/python3.6/site-packages/cryptography/hazmat/backends/openssl/hmac.py in verify(self, signature) 72 if not constant_time.bytes_eq(digest, signature): ---> 73 raise InvalidSignature("Signature did not match digest.") InvalidSignature: Signature did not match digest.
The message authentication failed, we got an exception, and we immediately know we have the wrong character for this position in the password. This is equivalent to the salted hash of each character with the woeful security offered by just 208 possible options.
It is vital we only know if the combination of multiple characters is correct and never an individual character. To achieve this we need to get back a value from decryption that looks plausible even if we have the wrong key. The cryptography package still allows us to do this, after we read through the warnings.
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes from cryptography.hazmat.backends import default_backend password = "a" salt = os.urandom(16) key = bcrypt.kdf(password=password.encode('utf-8'), salt=salt, desired_key_bytes=32, rounds=50) iv = os.urandom(16) partial_secret = 75073663912656191474987795000012827372 backend = default_backend() cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=backend) encryptor = cipher.encryptor() ciphertext = encryptor.update(partial_secret.to_bytes(32, byteorder="little")) + encryptor.finalize() print("Ciphertext:", ciphertext) decryptor = cipher.decryptor() y = decryptor.update(ciphertext) + decryptor.finalize() y = int.from_bytes(y[:16], byteorder='little') print("Plaintext:", y)
Ciphertext: b'\xb2mQ\xb5\xea<\xe8\x98\x12#\x88-t9\x13P\x89=\x8e|5Q\x98\x16\x97*\xe0!L\xdb\x0b"' Plaintext: 75073663912656191474987795000012827372
Now, if we change the character from the password:
password = "b" key = bcrypt.kdf(password=password.encode('utf-8'), salt=salt, desired_key_bytes=32, rounds=50) backend = default_backend() cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=backend) decryptor = cipher.decryptor() y = decryptor.update(ciphertext) + decryptor.finalize() y = int.from_bytes(y[:16], byteorder='little') print("Plaintext:", y)
The value returned looks plausible and there is no error
Now to check whether a character is correct we must go through the process of decrypting three parts of the secret and then verifying whether the recovered secret is correct.
Checking the secret
Generating the secret and shares and then recreating the secret from a subset of the shares is all handled by the shamir script.
import shamir secret, shares = shamir.make_random_shares(minimum=3, shares=8) print("Secret:", secret) print("Shares:", shares)
Secret: 101819857364438932439321523918342615485 Shares: [(1, 31524886020529511108308890309086814619), (2, 4579141086039571948169586631903681209), (3, 20982622560969114958903612886793215255), (4, 80735330445318140140510969073755416757), (5, 13696081278617415761304351476906179988), (6, 160147241981805405284658367528013716402), (7, 9665262173474413515511106079425708818), (8, 72673692235032135648924478278794474417)]
then with a subset of the shares the secret can be recovered
Putting it together
Given a password we create a secret and random salts, initialization vectors and a share of the secret for each character. For each character we then derive a key using the salt. Then using the key and the initialization vector the share is encrypted. The final step is to create a salt for the secret and then hash it. Finally, the salts, initialization vectors, encrypted shares and the hashed secret are all stored
To check a set of submitted characters the keys are derived using the stored salts, the shares are decrypted, the secret recovered and it is checked against the stored hash. If it matches, the password is correct.
All the code for this partial password implementation is available on github. Finally, although this was an interesting exploration I have no plans to use this and would council against anyone else using it. There are better options.