Monday, March 19, 2012

calculate base 32 notation from int value?

hello everyone,
i have an artificial keying system that is based upon a psuedo-randomly
changing 31 bit pattern stored in the database as an int.
the current key is stored in a table like so:
create table artificialkeys (
keyname varchar(50) primary key not null,
keyvalue int)
go
insert into artificialkeys (keyname, keyvalue)
values ('accountnumber', 1)
go
and the key is used by an entity such as this:
create table accounts (
accountnumber int primary key not null,
accountname varchar(50) not null)
go
and the key is incremented / implemented with stored procedures, such
as this:
create procedure [owner].[insert_account] (
@.accountnumber int output,
@.accountname varchar(50))
as
update artificialkeys set keyvalue = @.accountnumber = (currentvalue /
2) + ((currentvalue % 2 + ((currentvalue / 8) % 2)) % 2) * power(2, 30)
where keyname = 'accountnumber'
insert into accounts (accountnumber, accountname)
values (@.artificialkey, @.accountname)
go
this is the current setup, and it's working just fine. we get a
pseudo-random integer as our primary key with a value between 1 and
2^31 (due to the nature of the algorithm which i won't get into).
now comes the problem, and my question. these account numbers are
presented as ten digit strings to the customers (with leading zeros as
necessary), such as '0054613854', just a string version of the decimal
int value. no problem.
however we are moving to new accounting software, and there is a limit
of 6 characters for the account number, but it can still be
alphanumeric. and that brings me to my current plan. the value range,
if represented in base32 notation, would be from 1 to 4000, well within
the range the new software needs.
but this presents some problems that i thought i'd ask here. for
example, in base32 notation, some values are no longer acceptible. for
instance, i don't want someone ending up with the account number 'XXXX'
... so i'd like to impose a restriction that no more than two
consecutive digits can be above 9 in value.
also i'm not sure how i can convert the int value into a base32
representation inside the stored procedure, to be stored as a string
primary key?
the more i think about this the more it seems like i should handle this
in the middleware?
thanks for any thoughts,
jasonjason wrote:
> hello everyone,
> i have an artificial keying system that is based upon a psuedo-randomly
> changing 31 bit pattern stored in the database as an int.
> the current key is stored in a table like so:
> create table artificialkeys (
> keyname varchar(50) primary key not null,
> keyvalue int)
> go
> insert into artificialkeys (keyname, keyvalue)
> values ('accountnumber', 1)
> go
> and the key is used by an entity such as this:
> create table accounts (
> accountnumber int primary key not null,
> accountname varchar(50) not null)
> go
> and the key is incremented / implemented with stored procedures, such
> as this:
> create procedure [owner].[insert_account] (
> @.accountnumber int output,
> @.accountname varchar(50))
> as
> update artificialkeys set keyvalue = @.accountnumber = (currentvalue /
> 2) + ((currentvalue % 2 + ((currentvalue / 8) % 2)) % 2) * power(2, 30)
> where keyname = 'accountnumber'
> insert into accounts (accountnumber, accountname)
> values (@.artificialkey, @.accountname)
> go
> this is the current setup, and it's working just fine. we get a
> pseudo-random integer as our primary key with a value between 1 and
> 2^31 (due to the nature of the algorithm which i won't get into).
> now comes the problem, and my question. these account numbers are
> presented as ten digit strings to the customers (with leading zeros as
> necessary), such as '0054613854', just a string version of the decimal
> int value. no problem.
> however we are moving to new accounting software, and there is a limit
> of 6 characters for the account number, but it can still be
> alphanumeric. and that brings me to my current plan. the value range,
> if represented in base32 notation, would be from 1 to 4000, well within
> the range the new software needs.
> but this presents some problems that i thought i'd ask here. for
> example, in base32 notation, some values are no longer acceptible. for
> instance, i don't want someone ending up with the account number 'XXXX'
> ... so i'd like to impose a restriction that no more than two
> consecutive digits can be above 9 in value.
> also i'm not sure how i can convert the int value into a base32
> representation inside the stored procedure, to be stored as a string
> primary key?
> the more i think about this the more it seems like i should handle this
> in the middleware?
> thanks for any thoughts,
> jason
To eliminate the offensive words why not just exclude vowels from the
account numbers. That would only be base 31 but is generally acceptable
unless you want to start worrying about stuff like "SH1T". Maybe
another approach would be to construct a table of the smallish set of
unacceptable numbers and check against that each time.
Here's a generalised function for converting integers to a number base
defined by any character set:
CREATE FUNCTION dbo.IntToBase(@.i INTEGER, @.base_charset VARCHAR(256))
RETURNS CHAR(4)
AS
BEGIN
DECLARE @.b TINYINT, @.d1 TINYINT, @.d2 TINYINT, @.d3 TINYINT, @.d4 TINYINT
SELECT
@.b = LEN(@.base_charset),
@.d1 = FLOOR(@.i/@.b*@.b*@.b)%@.b,
@.d2 = FLOOR(@.i/@.b*@.b)%@.b,
@.d3 = FLOOR(@.i/@.b)%@.b,
@.d4 = @.i%@.b
RETURN
SUBSTRING(@.base_charset,@.d1+1,1)+
SUBSTRING(@.base_charset,@.d2+1,1)+
SUBSTRING(@.base_charset,@.d3+1,1)+
SUBSTRING(@.base_charset,@.d4+1,1)
END
GO
/* Base 31 characters */
SELECT dbo. IntToBase(1234,'0123456789BCDFGHJKLMNPQR
STVWXYZ');
/* Base 16 characters */
SELECT dbo.IntToBase(1234,'0123456789ABCDEF');
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Oops. That wasn't very well tested. Here's a correction:
CREATE FUNCTION dbo.IntToBase(@.i INTEGER, @.base_charset VARCHAR(256))
RETURNS CHAR(4)
AS
BEGIN
DECLARE @.b TINYINT, @.d1 TINYINT, @.d2 TINYINT, @.d3 TINYINT, @.d4 TINYINT
SELECT
@.b = LEN(@.base_charset),
@.d1 = FLOOR(@.i/@.b/@.b/@.b)%@.b,
@.d2 = FLOOR(@.i/@.b/@.b)%@.b,
@.d3 = FLOOR(@.i/@.b)%@.b,
@.d4 = @.i%@.b
RETURN
SUBSTRING(@.base_charset,@.d1+1,1)+
SUBSTRING(@.base_charset,@.d2+1,1)+
SUBSTRING(@.base_charset,@.d3+1,1)+
SUBSTRING(@.base_charset,@.d4+1,1)
END
GO
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||> To eliminate the offensive words why not just exclude vowels from the
> account numbers.
"And Serpent bringeth forth the vowel."
- Luke, 7-45
How about keeping a table of 'illegal' values? There can't be that many if
the string is only 4 characters long.
ML
http://milambda.blogspot.com/|||actually you're right. in fact, since the base value never goes above
2^31, the highest base32 value would be 4000, which means the first
digit never even gets to the letters. so now we're just talking about
three letter offensive words. of course we'd have to worry about
homophones too, like FUK. i'm not sure which would be easier, but i'll
consider both.|||oh wow, so i could essentially create my own notation that has no
vowels. i didn't realize that was even going to be an option, but hell
yeah, that seriously makes the offensive word thing a lot easier!
and yeah, we considered the 'leet' curse word issue, but decided that's
just too annoying to worry about. if someone gets account number SH1T,
and doesn't chuckle, then we can survive the loss.
thanks!
jason|||"jason" <iaesun@.yahoo.com> wrote in message
news:1138387792.331310.253560@.g43g2000cwa.googlegroups.com...
> actually you're right. in fact, since the base value never goes above
> 2^31, the highest base32 value would be 4000, which means the first
> digit never even gets to the letters. so now we're just talking about
> three letter offensive words. of course we'd have to worry about
> homophones too, like FUK. i'm not sure which would be easier, but i'll
> consider both.
ASS
FAG
KKK
and of course, the most offensive one of all... :-)
DBA|||FYI
Actually, the word "fuk" has that same meaning in several other languages.
Of course it's pronounced differently.
Maybe there already is a list of offensive words available somewhere.
http://en.wikipedia.org/wiki/Seven_dirty_words
ML
http://milambda.blogspot.com/|||> and of course, the most offensive one of all... :-)
> DBA
You forgot:
DEV
:)
ML
http://milambda.blogspot.com/|||hey David,
thanks, this function looks pretty versatile. i'm having an
implementation problem with it though. when i run it against the
existing list of account numbers, i'm getting a looping result. int
value 2147186326 is coming out as '0001' in base31 notation, which is
the same value that the int value 1. i thought perhaps that base31
couldn't store an int in only 4 digits, and this was a truncation
problem, so i tried to extrapolate the function to return base x
digits, but that didn't actually change the behavior, there was still a
notation wrap at 2147186326, so i'm sure i didn't modify it correctly.
just thought i'd run it by you to see if you had any ideas?
thanks again,
jason

No comments:

Post a Comment