Redo
Ever heard the term “GIGO”? If not, it means “Garbage In, Garbage Out”. That’s exactly what has happened to the SQL data warehouse I worked over a year on. The main data source, a student information system (SIS), is so fubar that I can’t guarantee the data.
The problem stems from lazy end users (about 80) of the SIS. Instead of checking if a student already existed, or heeding the error that a student already existed, they would just change some small detail about the student. This little “work around” causes duplicate entries in the SIS. Trying to code around the problem is out of the question simply because there are too many variables. Plus, when the duplicates are found in the SIS, I wouldn’t know which entry is correct so I may import the invalid entry while ignoring the valid one.
This sucks on so many levels. The highest level is the fact that the best option is to blow away what I have and start over. That will involve not only importing information from the SIS but also importing every standardized test (ACT, SAT, etc.) again in order to match it to a student.
I have brought this issue to the attention of those in charge but it was pointless. The answer is always “Maybe they just need more training”. Yeah, that’s it. The other four times they were trained to not do it was wrong. Fifth times a charm.
Idiocy - Never underestimate the power of stupid people in large groups. - Despair, Inc. Idiocy poster
25.Jul.08
Microsoft SQL
Comments (2)
A new race
One of the database projects I’m working on involves importing data from an existing student information system (SIS). The main data being imported is student name, date of birth, gender and ethnicity. The SIS stores the ethnicity of the student, in the student table, using an ID number that corresponds to the ID in the ethnicity table. I thought I’d just do a straight import of that table in order to keep from having to create a different ethnicity ID.
I start creating tables on my development SQL Server database by looking at the SIS tables, data types and data. I do that so that I don’t accidentally create a column with a data type of integer that has to be changed later to varchar or something. When I get to the SIS ethnicity definition table, I got a little confused. I saw the normal ethnic categories: White/Caucasian, Black/African American, American Indian, etc. The last three made me say, “Huh?”.
It seems that someone in the school system decided that the available ethnic categories didn’t really fit their students so they created new categories.
- >001
- >002
- Yes
Seriously. I know at least one school was using those categories because I was able to query the student table using the ID numbers for those ethnicities. Thankfully there were only 5 or 6 results out of about 20,000 kids. I brought it to the attention of the people in charge of the SIS. They said they would get it fixed right away and couldn’t understand how it happened.
Lesson they learned: Never underestimate your users.
The unforgivable crime is soft hitting. Do not hit at all if it can be avoided; but never hit softly. - Theodore Roosevelt
07.Feb.08
Humor, Microsoft SQL
Comments (0)
Trend Micro and MS SQL
Trend Micro software doesn’t play well with SQL logins. OK, SQL logins work but not in the way that is considered a security best practice: Windows Authentication. This became a nuisance during a recent Trend Micro IMSS installation.
The person in charge of installing the upgrade wanted to move the suite and database to another server. When he got to the step to provide the SQL Server username and password, he would put in the Windows domain account information. IMSS would then give an error indicating it couldn’t communicate with the SQL Server. I checked the SQL logs and it was showing that the login failed. He tried several times, making sure the password was correct, but it would fail every time. I had him try changing the domain username to domain\user and it would still fail.
We both checked the admin and installation guides for IMSS. The guides didn’t have any information about what could be causing the error. After about 20 minutes, it finally hit me. We had the same problem when we moved the Trend Micro OfficeScan Server to a new Windows and SQL server a couple of months ago. The admin guide for the OfficeScan server specifically stated that the installation did not support Windows Authentication with SQL Server. You must use a login that is set up on the SQL Server.
We tried the installation again, using the SQL Login I had set up for the OfficeScan Server install, and it worked fine. I did have to give Create any database rights to the login because the installation will not use an existing database. Why it won’t use an existing DB and just drop the tables/views/stored procedures is unknown to me. Instead, if a database with the name provided already exist, the install will drop the database and recreate it.
Untold suffering seldom is. - Franklin P. Jones
17.Oct.07
Microsoft SQL, Software
Comments (0)
Microsoft SQL: Rounding time to the nearest quarter hour
A web developer at work wanted to make a change to one of her pages. Instead of calculating the number of hours someone worked after pulling their time from the SQL database, she wanted to know if it could be calculated before the result set was returned. I thought it was a good idea because there may be times when multiple people simultaneously access their time sheets. Calculating the time on the SQL Server would be faster compared to the calculating it on the web server.
(more…)
17.Sep.07
Microsoft SQL
Comments (0)
IP Address storage in Microsoft SQL
Work has been hectic for the last couple of weeks. I am working on creating one database while updating/tweaking another one. The database I’m developing isn’t causing me any problems because it’s pretty straight forward. The other one, though, is causing me a lot of headaches.
One of the improvements I’m adding to the second database is the ability to view all IP addresses available at any given location to help with inserting static IP use records. Sounds easy enough until you consider that I don’t store every possible IP address for the locations. There are 30+ locations and each location IP range supports over 1,000 IP’s (Class B subnet). Because each location uses less than 100 static IP’s, I don’t see a reason to store all those IP’s without any correlating data.
The initial database creation only had two tables for IP storage: location IP range & equipment (computer, printer, etc.) static IP mapping. The location IP ranges table holds the location ID (details are in another table) and the individual IP octets. That means a range like 192.168.1.1 - 192.168.5.255 uses eight (8) columns to store the range. That was the first tweak, breaking down the IP addresses already stored and storing the individual octets.
I know it sounds odd to not store the IP as a whole, i.e. 192.168.1.1, but there are a couple of good reasons. One is storage space and the other is the ability to perform arithmetic operations against the IP without having to parse it using PARSENAME. The second reason came in really handy for the stored procedure showing all possible IP addresses for the locations.
The storage space savings isn’t huge but it can add up later. I found an article showing why it’s better to store the IP octets individually in tinyint columns. Tinyint works well for storing IP octets because it only supports numerical values from 0 to 255. Storing the IP this way also helps if I want to index the octet columns (combined and/or individual) to make searches faster.
Another plus is sorting the data returned. While stored as a whole IP, sorting the returned record set can cause unexpected outcomes (i.e. .100 before .20). Storing the individual octets allows me to sort the data by octet three and four (respectively) to get the record set sorted in the proper order.
Storing the IP addresses as individual octets can cause a problems. One of the problems is it can complicates queries. Queries to find the full IP address have to concatenate the cell values. So, if 192.168.1.1 is stored as individual octets and you want to pull the IP back with the decimal point, the query would look something like this:
SELECT CAST(Octet1 AS VARCHAR(4)) + '.' + CAST(Octet2 AS VARCHAR(4)) + '.' + CAST(Octet3 AS VARCHAR(4)) + '.' + CAST(Octet4 AS VARCHAR(4)) AS IPAddress
FROM IPTable
The octets have to be explicitly converted or SQL will try to create a decimal number with the concatenation. The I/O cost to cast tinyint to varchar is minimal.
I’ll post an example of the stored procedure to show all possible location IP addresses on the next post. The stored procedure will not only return all of the IP addresses in the range but it will also show information about any equipment using a static IP address (or nothing if the address is free).
Men do not quit playing because they grow old; they grow old because they quit playing. - Oliver Wendell Holmes
21.Aug.07
Microsoft SQL
Comments (0)






















