Paillier Cryptosystem for Homomorphic Encryption
Overview
Teaching: 0 min
Exercises: 0 minQuestions
What is Paillier encryption scheme?
What can be encrypted and calculated using Paillier HE?
How to save encrypted numbers?
Objectives
First learning objective. (FIXME)
Introduction
In this hands-on episode, we will focus on an implementation of HE that is available for Python programming language, that is using Paillier encryption scheme. We will be using the python-paillier library.
Paillier encryption scheme is similar to the popular
RSA algorithm
in that it involves exponentiation of a very large base with a very
large exponent.
The basic unit of plaintext (m
) in Paillier cryptosystem is an integer.
Real numbers can be approximated as very large integers.
Using Paillier encryption scheme, the following operations can be
done on the ciphertext (e.g. M1
, M2
, …):
-
Addition of a ciphertext with another plaintext:
M1 + n1 -> M2
-
Multiplication of a ciphertext with another plaintext:
M1 * n1 -> M3
-
Addition on two or more ciphertexts:
M1 + M4 -> M5
With these basic components, one can still do subtraction (with plain/ciphertext) and division with plaintext.
In this episode, we will learn how to:
-
Generate public/private key pair
-
Encrypt numbers
-
Decrypt numbers
-
Saving and loading data (keys and encrypted numbers)
-
Share our sensitive data to another party for remote computation without revealing our data
Preparation: Key generation
Paillier encryption starts with the generation of a keypair:
-
the public key (
pubkey
) is used for encryption; -
the private key (
privkey
) is for decryption.
# import additional stuff needed later on:
import json
# import Paillier library
import phe
from phe import paillier
pubkey, privkey = paillier.generate_paillier_keypair(n_length=2048)
What’s in a key?
A cryptography key is essentially a set of one or more numbers. Usually these are extremely large numbers that are difficult to compute, like prime numbers and the products thereof.
The private key consists of two prime numbers p
and q
.
(Some articles such as the Wikipedia’s
Paillier Cryptosystem
alternatively use the pair of greek symbols lambda
and mu
as the private key—they are just derived quantities from p
and q
.)
Formally, the public key consists of n
and g
.
By definition,
n := p * q
and most Paillier implementation (such as python-paillier we will be using later) uses a simple choice
g := n + 1
so effectively the public key only has n
.
What’s in your key?
Check your own public and private keys and examine the values of
p
,q
, andn
. Use theprivkey.p
,privkey.q
, andpubkey.n
attributes.Compare your values with your peer’s values. How large (order of magnitude) are these values? Hint: to measure the string length of an integer like those, use the
str
andlen
functions combined.Example solution
>>> privkey.p 165801263340632983976560662438308425856376892415968944445942078522128733590037545491114495778463344034457791215860577561908717660960279116282832632108138114113215470481000160667178582327571748667307748221469642828898728723202606759475506355383541570049534956671732505254974281600674365388829556194125478439663 >>> privkey.q 173692986048859584131292662978616306818667799173886008136009019862903330127902404198243622829355376568051859747081316805127608698230348868838407907757905356268206709017850161836762180321535363252843130114261965137350004486033503058254303012135765203279703631680903239188556052848628643153099498478342809602171 >>> pubkey.n 28798516520307858892245242752031788221019505752401640907847555469080110989819460404887769769114996691178956366902424297847922947959745336265127613942423489628418392856716544509021179099995101598581024448922988985325716248080094092790457195629831492960414027793811488320980935077935214695348015743458064652434646259278623218413099569743293632244873602245153101169163714018053846361736912463943370196927861624340029295752670877587921787341114737650043775604469578472260873497172429754696151563073203245718398299476346582842129611188422523870025644927967778458063530248617514979015051481160778109441927444069910857308373 >>> len(str(priv.p)) 309
The
p
andq
are over 300 digits long, whereasn
is over 600 digits!The number of bits are relatively easy to compute using the “%x” C-like string formatting and noting that each byte comprises of 8 bits:
# The count of hexadecimal digits in this number: >>> len("%x" % priv.p) 256 # The count of bits in this number (remember: a byte = 2 hexadecimal digits): >>> len("%x" % priv.p) * 4 1024
In real application, of course you won’t share your p
and q
values.
The n_length=2048
in the construction of the keypair sets the length
of the public key to 2048 bits.
This makes the length of each p
and q
to 1024 bits each.
Basic operations
Let us now encrypt some data and do some operations:
Encrypting numbers
Python-paillier supports the encryption of integers and real numbers.
The encryption will cryptically represent a plaintext as another number that
is very large in magnitude.
It is advisable not to use numbers that are extreme in the magnitude
(for example, 10**100
[ten raised to the one-hundredth power] or beyond)
because the extreme values can cause overflow in the representation of
the ciphertext.
Let us begin with encrypting two numbers m1
and m2
below:
m1 = 3.14159
m2 = 10.01
M1 = pubkey.encrypt(m1)
M2 = pubkey.encrypt(m2)
According to Paillier’s scheme, encryption of m is essentially raising
the value of (n+1)
to the power of m
, then multiplied with a specially
random integer that will obfuscate the M
value further.
We can examine the garbled values of M1
and M2
:
# What is this object?
>>> M1
<phe.paillier.EncryptedNumber at 0x1071fd048>
# What is the numerical value of the garbled message?
>>> M1.ciphertext()
358042382770970270871524800057088993221249913569455679842042704805236196798901110138194625868064580648366378045126433145589847592042155363527717057481962442927634034228963964441449764296013360397445353630883402829038700969195047512245998315288904243095699545878859254325699578614910293987106279004516694723811771678355852903333263976670616317424950250139809342812336448835863147832115469392284379745141185442407804051322178178067222135101866778737234526964369704209798907917843985713752348919395564785629545112505111864804352424749406544201602358442027344200150631027335061665436620838770976547147216884719120406406990135182420301787378701807752625145514333784471858890922709189677324828050487528956994674637706742097263253068935111408501343290249421788540787705161699618174447299562550175340316069533657736832883926749539295329978528335318595004290370344167608315106573144494125367974775195050854204929161874473987196886524872998925329629342955885015929491823940042109455343304878081339271308043826293205311947989826903150983651291506365361391467422114815705184215214751061898354179774360927559552458033988283761950762737874945446251124416375925915879200362054684310251908696240048698646030868448074744342597878483402868383817342731
>>> M2.ciphertext()
251500510465933159444129963041891475735958637022077320726753988995782543605875276064355046382636476224156927573410788265913428536281362746678685886477342515747725346205724908277948589850498950113032857048973248166230503904414015080108483413222096602346345325763118683239416195596913352139016432488199232132566385839598965555332124257078399090417720309411774421097191437214204796576044983316327834833222105146237548316050866369699850876429255645296741309710225470794517666162685190673394137156087317381360574331350929674031046372889099732764768604092375642787380395168571666222170013934385236219940414879886409254086394673631604203605436108581499762932943318959134010040793568875436218443906828584820850410509803672150295495821202471827088903655699694039407477528126302536535755178913446557295082494305811588939339905625856529970642483000600509412103276093981197711165525459371031282679106443571999891342788487025728031054793474962502015189375063908990710504205059139340048441494846749585770682660763741858638982000978845126174168249264753262086981396109082398653554959250888229072524248786310589381771608267693735086616061149065114993335429575126741915070311764384862452182088480319731960530717752449479079558471598704820992912826091
For a 2048-bit public key (n
), these ciphertexts are 1024-bit values.
There is a surprising feature with Paillier encryption: the ciphertext above are not unique.
Suppose we want to re-encrypt m1
into another encrypted number for the second time,
then the ciphertext will not come back the same:
>>> M1b = pubkey.encrypt(m1)
>>> M1b.ciphertext()
215320743737696792505629417916091495555024649807936320684353999140834571484779962921512277916909414641309636112370294503228090979147962780269526673009406021881896793720689090071312153650626495787508248861034327242415603410146990342661619334510416764998889942132905802824724239120675557183424815361330704267772947690005229292906897925217332204148870783543479608682054196054545195592919684891849750764136444683823057851495872857773215322251877541513716609320457821781879940992643784545715595574202450603214394953691090145907141824134262849406978878239026886759001237077626562486714003717466071035536213885602577090940590577178665013822391659484716541142683844179550123522737256279019202767987360008603243973448284620325533478542036736751743971556195842146128863327582513693071968646235738170809604622656200900953991868485445863364714104339124619095140215161163009357556551996754076825186466841820534524743827225220418936217000161402407525456696209429025630673991941894032978204250155747283840025096489965601737665129716201572068989600305650659617843415841667211599052292050537888713743676090424391401102401343332605608284942470259004441154791461641050713624527451566473359673037755770138383260106548157593847792994342362880470720049308
The garbled form is different, yet the underlying plaintext is the same! This is an intended feature of the encryption scheme: by just looking at the ciphertext, one will not know whether two values are identical. In fact, it is impossible to perform numerical comparison involving one or more ciphertexts.
Now let us perform operation:
>>> M3 = M1 + M2
# Create some constants
>>> n1 = 2
>>> n2 = 0.125
>>> M4 = M1 * n1
>>> M5 = M1 + n2
# Multiplication of two ciphertexts are not supported
>>> M6 = M1 * M2
... (some stack trace)
NotImplementedError: Good luck with that...
The results of the mathematical operations above (M3
, M4
, M5
)
are also encrypted numbers which are jibberish to those without the
decryption key:
>>> M3.ciphertext()
346271150828615488181846097847395851154922255082407573156807157423...
>>> M4.ciphertext()
217806911011334857004539935049659502210592047599395846344934617502...
>>> M5.ciphertext()
105199549075824746345972983983003901376943492867585255733113741110...
We can get back the actual numbers after numerical operations:
>>> m3 = privkey.decrypt(M3)
>>> m4 = privkey.decrypt(M4)
>>> m5 = privkey.decrypt(M5)
>>> print(m3)
13.151589999999999
>>> print(m4)
6.28318
>>> print(m5)
3.26659
How about cross-checking against the result of plaintext arithmetics?
>>> m1 + m2
13.151589999999999
>>> n1 * m1
6.28318
>>> m1 + n2
3.26659
Let’s play around with some numbers and make sure you are comfortable with the allowable operations!
Saving and loading encrypted data
Saving keypair
Every time generate_paillier_keypair()
is called, a different keypair
is generated, and the key values are random.
This randomness is part of computer security.
Therefore, there has to be a way to save and restore a keypair.
For this module, we will follow the approach suggested used by the authors of
python-paillier (refer to python module phe.command_line
in the
source code for the origin):
def keypair_dump_jwk(pub, priv, date=None):
"""Serializer for public-private keypair, to JWK format."""
from datetime import datetime
if date is None:
date = datetime.now().strftime('%Y-%m-%dT%H:%M:%S')
rec_pub = {
'kty': 'DAJ',
'alg': 'PAI-GN1',
'key_ops': ['encrypt'],
'n': phe.util.int_to_base64(pub.n),
'kid': 'Paillier public key generated by phe on {}'.format(date)
}
rec_priv = {
'kty': 'DAJ',
'key_ops': ['decrypt'],
'p': phe.util.int_to_base64(priv.p),
'q': phe.util.int_to_base64(priv.q),
'kid': 'Paillier private key generated by phe on {}'.format(date)
}
priv_jwk = json.dumps(rec_priv)
pub_jwk = json.dumps(rec_pub)
return pub_jwk, priv_jwk
The function above returns two json objects (which are basically two strings) from the public and private keys, respectively.
The public key can be written easily to a text file:
with open("phe_key.pub", "w") as F:
F.write(pub_jwk + "\n")
print("Written public key to {}".format(F.name))
print("n={}".format(pubkey.n))
Saving private key to file
Please follow the same pattern as the code snippet above to save your private key to a file called
phe_key.priv
.
Real-world challenge: Securely saving private key to file
We don’t use the simple
open
function to create a file to which we write the private key, because this is a security critical piece. Instead, we have to (1) delete the existing file (if it exists already), then (2) create the file with the exact permissions from the get-go (i.e. no group and world read-write permission bits). These steps are necessary so that no bystander can steal the private key information due to race condition. Challenge yourself to come up with a secure way to accomplish this.Solution
Please look at file
Paillier/create_save_keypair.py
in your exercise directory for a possible solution.
Loading keypair
The routine to recreate the public/private keypairs is given as follows:
def keypair_load_jwk(pub_jwk, priv_jwk):
"""Deserializer for public-private keypair, from JWK format."""
rec_pub = json.loads(pub_jwk)
rec_priv = json.loads(priv_jwk)
# Do some basic checks
assert rec_pub['kty'] == "DAJ", "Invalid public key type"
assert rec_pub['alg'] == "PAI-GN1", "Invalid public key algorithm"
assert rec_priv['kty'] == "DAJ", "Invalid private key type"
pub_n = phe.util.base64_to_int(rec_pub['n'])
pub = paillier.PaillierPublicKey(pub_n)
priv_p = phe.util.base64_to_int(rec_priv['p'])
priv_q = phe.util.base64_to_int(rec_priv['q'])
priv = paillier.PaillierPrivateKey(pub, priv_p, priv_q)
return pub, priv
Key re-creation is straightforward:
with open("phe_key.pub", "r") as F:
pub_jwk = F.read()
with open("phe_key.priv", "r") as F:
priv_jwk = F.read()
pubkey, privkey = keypair_load_jwk(pub_jwk, priv_jwk)
Saving encrypted number
An encrypted number needs to be accompanied by its associated public key
for it to be operable mathematically.
Therefore, when saving one or more encrypted numbers (for example
to a disk file), one must include the public key,
or have a way to re-associate the public key when reloading the data.
In the routines below we include the public key in the dump and load functions.
The envec_
prefix stands for “encrypted vector”, where
vector just means a list of one or more numbers.
def envec_dump_json(pubkey, enc_vals, indent=None):
"""Serializes a vector of encrypted numbers into a simple JSON format."""
from phe.util import int_to_base64
R = {}
R['public_key'] = {
'n': int_to_base64(pubkey.n),
}
R['values'] = [
(int_to_base64(x.ciphertext()), x.exponent) for x in enc_vals
]
return json.dumps(R, indent=indent)
def envec_load_json(R_json):
"""Deserializes a vector of encrypted numbers."""
from phe.util import base64_to_int
R = json.loads(R_json)
R_pubkey = R['public_key']
R_values = R['values']
# deserialized values:
pubkey_d = paillier.PaillierPublicKey(n=base64_to_int(R_pubkey['n']))
values_d = [
paillier.EncryptedNumber(pubkey_d, ciphertext=base64_to_int(v[0]), exponent=int(v[1]))
for v in R_values
]
return pubkey_d, values_d
All these routines are collected in a Python module called
paillier_tools
, available in your hands-on directory
(file Paillier/paillier_tools.py
).
Please take a second to look into this file and understand the purpose
of each functions contained therein.
There are two additional files, read_file
and write_file
, which
are meant to read or write text data (represented as a single Python
string).
Creating your own keypair and saving it to disk
As an exercise, please write a script called
mykeys.py
that creates a keypair, then save the public and private keys to two files:phe_key.pub
andphe_key.priv
, respectively.Hint: this is the beginning part of the program:
import json import os import phe from phe import paillier from paillier_tools import * pubkey, privkey = ... # complete on your own
Hint: use the
paillier.generate_paillier_keypair
,keypair_dump_jwk
, andwrite_file
functions.Solution
A possible solution is provided in
create_save_keypair.py
file in your hands-on directory.Note: this script employs a technique to store the private key securely, which will be necessary when programming real applications! But we don’t need to worry for this in our hands-on activities; simply use
write_file
.
Now we are ready to perform some fun operations with the Paillier homomorphic encryption!
Key Points
Paillier cryptosystem permits addition of two ciphertexts, or one ciphertext and one plaintext, or multiplication of one ciphertext and one plaintext.