Fun with numbers

Information about Fun with numbers

Published on November 12, 2008

Author: phwtoo



Fun with numbers : Ngiam Shih Tung December 22, 2003 Fun with numbers Breaking the NRIC check digit algorithm Introduction : Introduction The algorithm for computing the check digit for Singapore identity card numbers is unpublished Algorithm is partially described in various open sources Objective of this exercise is to elucidate the complete algorithm from internet resources and “virtual experimentation” UIN/FIN structure : UIN/FIN structure The National Registration Identity Card (NRIC) number is the Unique Identification Number (UIN) or Foreigner Identification Number (FIN) Century prefix S, T - 19th and 20th letters of alphabet for UINs issued in 19xx and 20xx respectively F, G - Foreigners (not 7th and 8th century !) Check digit (official reference) Computed from first eight characters of UIN/FIN Detects data entry errors How do we calculate this ? UIN/FIN algorithm : UIN/FIN algorithm Government will release UIN/FIN algorithm for computing check digit, BUT “Application is open ONLY to Singapore-based organisations with the legitimate need for the UIN/FIN validation.” “Your application is subject to our final approval and our decision shall be final” License agreement requires: “The Licensee agrees to take all reasonable steps to protect the Licensed Material from unauthorised copying, adaptation or use.” License fee Algorithm $200 Sample code $400 Source: ICA website ( IP AnalysisCan the government really prohibit unauthorised use ? : IP AnalysisCan the government really prohibit unauthorised use ? Copyright Source code is subject to copyright Algorithms are not subject to copyright Patent Algorithms are patentable, but Patent must be published Prior art probably exists in this case Patent, if any is long expired (> 20 yrs) Trade Secret May be protectable under the license agreement BUT, no secret if the information is already publicly available or obtained via a different route Modulo 11 checksum : Modulo 11 checksum Algorithm for S-series (old-style) NRIC numbers is well-known* d = [(d1 d2 d3 d4 d5 d6 d7) • (2 7 6 5 4 3 2 )] mod 11 = ( 2d1 + 7d2 + 6d3 + 5d4 + 4d5 + 3d6 + 2d7 ) mod 11 Lookup d: Weights 7-digit NRIC number Does this work for F, G, T-prefix UIN/FINs ? * e.g. soc.culture.singapore newgroup postings (1995) 1 Reverse Engineering the FIN algorithm : Reverse Engineering the FIN algorithm Find a large set of FINs then reverse engineer the check digits to determine weights and mapping of checksum to letters MOM publishes a list of Registered Safety Officers on its website 48 out of 1,287 Safety officers are foreigners with FINs By inspection, same algorithm and same weights are used but with different check letters: FINs extracted from MOM website Checksums calculated using formula 1 21st century UINs - T & G prefix : 21st century UINs - T & G prefix Difficult to obtain large list of T-and G-series UINs Children born and foreigners registered during or after 2000 Solution: Use a brute force approach and rely on the National Library web interface to check accuracy of guess Virtual ExperimentVerifying UIN/FIN check digits : Virtual ExperimentVerifying UIN/FIN check digits Enter Test UIN/FIN Guess check digit (letter) corresponding to IC number Guess incorrect Guess correct ! Enter any name / birth month NLB Online Services or 21st century UIN/FIN check digit : 21st century UIN/FIN check digit By exhaustive search, we conclude for T-prefix UINs Same weighting factors and modulo 11 algorithm is used but Mapping of check digits is shifted 4 places Similar shift is observed for G-prefix FINs Shift 4 places Shift 4 places Universal UIN/FIN Check Digit Algorithm : Universal UIN/FIN Check Digit Algorithm For any UIN/FIN of format P d1d2d3d4d5d6d7 C where P = Century prefix {S, T, F or G} di = Number, i = 1..7 C = Check Digit (letter) d = { d0 + [(d1 d2 d3 d4 d5 d6 d7) • (2 7 6 5 4 3 2 )] } mod 11 d0 = 0 for P = S or F = 4 for P = T or G Check digit is determined by prefix and value of d References : References UIN algorithm described in chapter 3 of course notes for NUS Coding Theory course ( S & T prefix algorithm confirmed No known public references to F, G-prefix FIN algorithm Other checksum implementations Hong Kong Identity Card HKID uses numerical check digit, e.g. B255241(3) Check digit given by modulo 11 checksum with weights (8, 7, 6, 5, 4, 3, 2) where letter prefix is converted to number A=1, B=2, etc. Use X if remainder is 10 International Standard Book Number (ISBN) ISBN is 9 digit number with check digit given by modulo 11 checksum Weights (1, 2, 3, 4, 5, 6, 7, 8, 9) Use X if remainder is 10 Points to Ponder : Points to Ponder Why modulo 11 ? For numerical check digit, using modulo 11 allows checksum to be written as single digit (10 = X) For alphabetic check digit, modulo 26 is more likely to detect errors Why weights (2, 7, 6, 5, 4, 3, 2) ? Is there an optimal weighting scheme (compare to HKID, ISBN weighting factors) ? Why ABCDEFGHIZJ for S-prefix UINs ? Will there be U-series UINs in 2200 ?

Related presentations