As we write the SQL code, we'll enter it into a well-formed SQL parser to ensure completeness. We'll actually create and test the database as we go along! Which SQL should you use? We have taken great pains to make all the following code completely compliant with the SQL-92 standard. Any program that complies with this standard should work. (We don't generally get into the fancy frills of SQL-99).
If you're using Windows, we strongly recommend that you get a copy of Ocelot SQL. This is the most compliant parser we've come across, and has numerous other merits, which we mention elsewhere. If you're working under Linux, then PostgreSQL should do the trick! (We might occasionally refer to Ocelot along the way, to ease your passage in certain areas). |
We will go about things as follows:
If you glance through the site, you'll see that there is a lot of denormalised data. There are over three hundred complex pages, with (to make things worse) several hierarchical levels. Let's try and break all this complexity down, and create a normalised database.
Please note that we are not creating active server pages (ASPs) - all we are doing is extracting and re-synthesizing data. The data thus extracted could of course be used in an ASP environment, but this is not our primary goal!
In more detail:
After contemplating this horrorshow, we decide that we are (initially at least) going to target Levels 3, 4, and 6, marked with *stars above. This decision is because the other pages are incredibly static, and although it will be fairly easy to extend our data extraction and webpage synthesis to include these pages, there is little point. We target the pages that will change the most - those with product and supplier details.
But note that each component of the lists of item characteristics is a unique individual property. More of this later!
|
Before we start, let's agree on the colour coding used below:
|
Without further ado, let's pull out the tables we consider relevant. We won't painstakingly go through the denormalised appliance and dealer information, successively lopping off bits and normalising. We'll simply give you the tables we think are required, followed by some justification of our choices..
MANUFACTURER {Brand} | |||
Name | Type | Size | Description |
ManufacturerCode | decimal | 4,0 | Up to 9 999 different manufacturers, database generates code |
ManufacturerName | varchar | 20 | The actual brand (manufacturer) name |
Think about the BRAND of an item (eg a Sony television set, or whatever). Although at present we don't have any more detail about the manufacturer on our website, this doesn't mean that we won't in the future. It certainly doesn't mean that we won't want to store say, Manufacturer contact numbers in our database! So, right at the start, let's pull out the "Brand", give it the more appropriate name of "Manufacturer", and create a separate table for it. Now, it's clumsy to have a long name (which may even change) as a key, so we'd rather generate an internal code, and use this code when we refer other tables to the Manufacturer!
CATEGORY {includes sub and sub-sub categories} | |||
Name | Type | Size | Description |
CatCode | decimal | 4,0 | Database generates code. We will use zero as the code for the "ROOT CATEGORY", that is appliances! |
CatParent | decimal | 4,0 | SELF-REFERENTIAL |
CatName | varchar | 40 | name to be used for the category/sub..category |
CatDesc | varchar | 64 | An optional text description of the category {unused at present} |
CatAlign | char | 1 | Options are 'H' and 'V', corresponding to the two different menu formats (horizontal and vertical) in level 3 |
Categories are such things as TV sets and washing machines, or even say "Fridges with top freezer, 400 to 549 L". They are a bit of a blighter. One category can depend on another, and finally we get a list of appliances within the category. An appliance can be the child of a category, a sub-category, or a sub-sub-category. In making this table, we've been a bit sneaky, and turned things around, by making the CatParent field refer up to another "super-category", and so on. The table actually self-referentially (recursively) talks about itself! We can use this trick to define a whole 'tree' of categories and sub- categories, and pull out the relationships between them when we need to!
One more wrinkle! Remember that there were two different types of level 3 menu - simple ones with the picture and text information about an appliance all in a row (we called these format H, for horizontal), and those with tall pictures, where the pictures were alternately on the left and right of the text (format V). Think about it - each of our categories that contains such a table will therefore fall into one of two groups (H, or V). If we ever come across a category where V or H doesn't apply, then we'll simply conveniently put in an 'X' (for "don't care").
MODEL {of appliance} | |||
Name | Type | Size | Description |
ModelCode | decimal | 8,0 | Up to 99 999 999 models [excessive!], database generates code |
ModelName | varchar | 15 | External serial number or code of model {is 15 enough?} |
ModelDesc | varchar | 64 | {not used at present ???} |
ModelParent | {AS for CATEGORY.CatCode: the immediate parent} | ||
ManuCode | {As for MANUFACTURER.ManufacturerCode} | ||
Price | decimal | 8,2 | Price max of 999 999.99 |
Model is a biggie. Again, rather than using an unreliable external code (which may even be changed by the manufacturer, although the specifications are for our purposes identical!), we generate a unique internal code. We also include an (optional) field for a model description, as we anticipate its usefulness, despite the lack of such information on the web-page.
A question. How can we signal that a price is INVALID? (For example, we know the price should be updated, but we dont have the datum to hand, and we must update the rest of the database). Setting the value to 0 is too ugly to be contemplated. Our solution will be to store a negative number, (minus the previous valid price), to signal to the database that [INVALID] must be signalled. This would trigger two things (a) Someone would get a gentle reminder to find out the true price, and (b) On the web page we would be more genteel with something like "Price on application"! Should we disallow price values below or above a certain value (have a range check)? We should certainly disallow zeroes!
What would happen if rampant inflation hit our third world country? Unlikely, but even if it did, we would just increase the size of the price field. This would not affect the data.
PICTURE - small and large pictures of appliance, {provisional - see below} | |||
Name | Type | Size | Description |
PicCode | decimal | 8,0 | Up to 99 999 999 distinct pictures, database-generated code |
PicName | varchar | 20 | actual name of picture, including .jpg or .gif suffix, but not the standard "images/" prefix ie path. |
PicAltTag | varchar | 30 | Description of the picture, as in HTML Alt tag |
PicWidth | decimal | 3,0 | pixel width of picture, up to 999 [?!] |
PicHeight | decimal | 3,0 | pixel height of picture, up to 999 [?!] |
ModCode | {As for MODEL.ModelCode} | ||
PicUsage | char | 1 | big or thumbnail, coded as B or T. |
Note the similarities between big and small pictures. They both have names, widths, heights, and alt tags. So we store them in the same menu. It would seem crazy to do otherwise, but note that there is a teeny normalisation problem, as (if you go back and look at the alt tags on the web-page), the alt tags actually contain serial numbers of appliances! We won't fuss about this. What we will do is smarten up the Perl script that pulls the data out of the HTML pages. It simply ignores appliance serial numbers in alt tags when making our giant denormalised table of data. Then, when we re-generate the tags from the database, we insert these by default. {Needs fleshing out, lots of thought! Note: we might have a check on picture size, with maxima and minima for the two main types of picture. A minor frill}
FEATURE {listed item characteristics} | |||
Name | Type | Size | Description |
FeatureCode | decimal | 8,0 | Up to 99 999 999 distinguishable item features; database- generated code |
FeatureType | varchar | 15 | Single word best describing feature |
FeatureText | varchar | 255 | Text description of feature |
The long lists of features of each appliance (contained in the individual appliance HTML pages) are a major headache. They are non-standardised comments about the important features of all of the appliances. Punctuation is idiosyncratic. Clearly many of the comments are similar, but not identical. Automatically capturing this information in a standardised form is a pain. We will aim for at least some degree of standardisation, and in addition incorporate the facility for later trying to identify key words within each piece of blurb - for example, "width", "height", "VHS", "PAL", "NTSC" and so on. It is also possible that features in the left hand column of one page may appear on the bottom of the next, and vice versa, so we won't here distinguish between the locations of the features on individual web pages.
Clearly, summary features will usually be shorter than detailed features, but as we are using varchar for the field type, this shouldn't adversely affect performance or storage space.
{Note that automatically getting intelligible words into the FeatureType field is a non-trivial task. We will not here address this}
MOD_FEATURE {associate model and various features} | |||
Name | Type | Size | Description |
FeCode | {As for FEATURE.FeatureCode} | ||
MoCode | {As for MODEL.ModelCode} | ||
PutMeAt | decimal | 1 | Where do I put model feature? Coding might be L=left, B=bottom, S=summary, but then what about combinations. Best have S=1, L=2, B=4, then can combine!! |
MyNumber | decimal | 8,3 | If there is a decimal associated with the model feature, it goes here. |
Priority | decimal | 2,0 | Precedence of item. See comment below. |
Aha! Here we associate features with particular models. The PutMeAt (put model feature at) field allows us to decide where the feature should go (in the summary text of a menu that describes several appliances, or in the detail menu for a single appliance).
{Note the field MyNumber. We won't here implement use of this field, but the intent is clear. Where a number is associated with a feature (e.g. 'Width = 333 mm'), we intend to eventually extract the number and have a generic feature ('Width = * mm'). Multiple numbers in a single line will clearly present a substantial problem - avoid this by breaking up the lines}
Note that we have another problem. The definition of a relational database doesn't allow row order, a completely alien concept. There is no doubt that we will want certain features to appear prominently (and even more importantly, fairly consistently) in our final HTML code. There is no guarantee that if we put features of a particular appliance into our database in a particular order, we will get them out in that order. So we need to "prioritise" our features. This explains the priority column. {explore consequences!}
{A small but irritating further problem is that summary features and other features may overlap (with redundant information)}
|
You can see that there are a lot of implicit associations within the data. For example, the telephone code is related to the region and subregion in a complex fashion. (The first three digits of the telephone number will also provide detailed information about the location of the supplier). One of the address lines will probably be associated with subregion, and telephone code. There are other problems that may not immediately be apparent, for example a supplier may have more than one telephone, currently indicated by a slash/slashes in the (longer, denormalised) telephone number. Larger suppliers may be able to cover several subregions. And so on.. We do as follows:
PLACE | |||
Name | Type | Size | Description |
PlaceCode | decimal | 7,0 | Generated by database. Up to 9 999 999 distinct places. |
PlaceName | varchar | 25 | See text below |
PlaceIsIn | {As for PLACE.PlaceCode. Recursive} | ||
PlaceSize | decimal | 1,0 | Possible values <= 3 minor, 4=subregion, 6=region. See text. |
Sneaky. We make PLACE recursive on itself. This allows us to build an implicit hierarchy of places. A supplier may live at The Mews, 33 Mandela Crescent, Little Suburb, Big Town, Gauteng; or simply in Johannesburg, Gauteng. Knowing just the first part of the address, we can reconstruct the whole, and conversely we can answer questions like "How many suppliers are there in Gauteng", applying the recursive argument in reverse. Some caution is advised because (a) The Mews (in our example) could be the name of locations around the country; (b) In this schema, 33 and 34 Mandela Crescent are not regarded as being spatially associated; (c) There may be breaks in the chain of continuity - that is, somebody may live in SmallSuburb, and we may not know where it is in the big scheme of things; (d) Once we start introducing the addresses of people other than suppliers, the logic becomes messy, because now if we work down from the top of the tree (Gauteng) to individual addresses, we get zillions of them, and then have to cross match these with supplier addresses, which becomes time-consuming. All of these are addressable [heh], (a) by confirming, where locations have the same names that they are the same - Perl will supply a chain, and we can compare the whole chain, creating two "The Mews"es if the match fails. This should also take into account the possibility of missing information in a match; (b) Numeric information could be extracted {We won't explore this!}; (c) If the chain doesn't end (upwards) in a Region, then query the administrator about the region to link to; (d) There are several solutions. I would be tempted to denormalise, and flag each terminal address that houses a supplier! Ugly but efficient, and not a great waster of space! We could, for example, have PlaceSize = 1 mean a non-dealer address, and 2 = a dealer.
The main purpose of the PlaceSize field is to be able to identify a place as a region or subregion!
TELNUM {includes area code} | |||
Name | Type | Size | Description |
TShort | decimal | 7,0 | {Generated by database. Up to 9 999 999 distinct numbers} |
AreaCode | decimal | 5,0 | {check maximum code size ???} |
TelNumber | decimal | 8,0 | {? just in case. 7 adequate. No "-" in number} |
Rather than getting into a complex mess with relating telephone numbers to addresses, we sidestep the issue, lump the two together, and give each combination a unique code. This solution may be far from optimal, and (like a lot of what we've done) can be criticised in a dozen ways. Oh well. At least we can easily change a (whole) telephone number, and if the area code for a whole region changes, we only do one table update with minimal fuss.
SUPPLIER | |||
Name | Type | Size | Description |
SupplierCode | decimal | 4,0 | Up to 9 999 suppliers. Internally generated code. |
SupplierName | varchar | 30 | The formal name of the supplier. |
SupplierEmail | varchar | 15 | Email of supplier, without "@appliance4u.com" on end. |
SupplierAddr | {As for PLACE.PlaceCode} |
The above table is fairly straightforward, as is the following which relates supplier and (one or more) telephone numbers.
SUPP_TEL | |||
Name | Type | Size | Description |
STelShort | {As for TELNUM.TShort} | ||
SuppCode | {As for SUPPLIER.SupplierCode} |
SUPP_AREA | |||
Name | Type | Size | Description |
SuppCode | {As for SUPPLIER.SupplierCode} | ||
AreaCode | {As for PLACE.PlaceCode} | ||
Status | decimal | 2,0 | {Relative weighting of supplier in this area. See text. Default = 49, Range is 0..99} |
The relationship between supplier and area is interesting. Here, we will constrain the regions - they must be a place-size that is a SUBREGION (PLACE.PlaceSize = 4). This arrangement allows us to link one supplier to many regions, and rank each supplier in each region (for example, based on customer feedback for that region).
|
CUSTOMER | |||
Name | Type | Size | Description |
CustomerCode | decimal | 7,0 | Up to 9 999 999 customers. Internally generated code. |
CustomerSurname | varchar | 30 | Self-explanatory |
CustomerTitle | varchar | 5 | Mr,Mrs,Miss etc. {check. Look for longer possibilities??} |
CustomerEmail | varchar | 30 | full Email of customer |
CustomerArea | {As for PLACE.PlaceCode; constrained to subregion PlaceSize=4} |
The above table is straightforward. See how we use PLACE which we defined previously.
CUST_TEL | |||
Name | Type | Size | Description |
cTelShort | {As for TELNUM.TShort} | ||
CustCode | {As for CUSTOMER.CustomerCode} |
CUST_TEL is so similar to SUPP_TEL that one is tempted to amalgamate the two. Hmm?
ENQUIRY {by customer of supplier about appliance} | |||
Name | Type | Size | Description |
ModlCode | {As for MODEL.ModelCode} | ||
CustCode | {As for CUSTOMER.CustomerCode} | ||
SuppCode | {As for SUPPLIER.SupplierCode} | ||
Price | decimal | 8,2 | Price max of 999 999.99 |
Date | float (!) | {no qualifier} | Date stamp of the e-mail! See note below. |
CustComment | varchar | 255 | Comment from customer; Perl script will have extracted any telephone number(s) |
Note the composite key. The price is the price that obtained when the enquiry was made, need not be identical to current price! One should probably augment the above table with fields for (a)Followed up (b)Sale made and (c)Sale price, with a Supplier Comment field, but we will leave this for future alteration, perhaps when/if we can get consistent supplier responses!
You might think it's a bit strange that we store a Date in a floating point number!! In fact this makes a lot of sense when you realise that every man and his dog (and certainly every SQL database creator) has defined a different standard. We will use a Julian date (simply a floating point number) to unequivocally and succinctly store the date and time.
{We won't here explore the conversion routines
required, but will later refer [xref] to our Date and Time web-page,
as well as "linking external function into SQL"}
4. Extending the model
It's tempting to now ask the question "How would the above model have
to be altered to accommodate menus on other levels?"
The other level menus are made up of fairly carefully formatted HTML
code, with few items on each page. The front page (Level 0) is unlikely
to be changed except through manual tweaking, and likewise for levels
1, 2, 3A, and 5. However, even these levels could be automated, with
templates that could be filled in with the relevant category information,
and an appropriate picture. The pictures could be stored in the
PICTURE table,
and a minor modification required there would be to create another
option for PICTURE.Usage. The problem is that PICTURE has a ModCode
field, while in this circumstance we would want to associate a picture
with a CATEGORY! This is probably sufficient reason to convince us
to remove the Modcodefrom the PICTURE table, and create a separate
MOD_PICTURE table. We can then store all pictures together, and have
one or more picture-associating tables, thus:
PICTURE2 {pictures of appliances} | |||
Name | Type | Size | Description |
PictureCode | decimal | 8,0 | Up to 99 999 999 distinct pictures, database-generated code |
PicName | varchar | 15 | actual name of picture, including .jpg or .gif suffix, but not the standard "images/" prefix ie path. |
PicAltTag | varchar | 30 | Description of the picture, as in HTML Alt tag |
PicWidth | decimal | 3,0 | pixel width of picture, up to 999 [?!] |
PicHeight | decimal | 3,0 | pixel height of picture, up to 999 [?!] |
PicUsage | char | 1 | big or thumbnail, coded as B or T. (? Option of C for Category) |
MODEL_PIC | |||
Name | Type | Size | Description |
ModCode | {As for MODEL.ModelCode} | ||
PicCode | {As for PICTURE2.PictureCode} |
What about Level 1A? At present these menus too are so small (and static) that manual alteration is an acceptable alternative to automation. However automation would be fun! Note that the detail menus for these items are the same as for other items, it's only the preceding menu that's a problem. Here, as for levels 1, 2, 3A and 5, a template could be filled with relevant text and pictures; a warning could be issued if the number of items exceeded the number of slots in the template (This overflow being determined by a Perl script).
Another thought: "What about having the HTML templates as part of the database too?" Although feasible, this seems excessive. Perl is doing the actual HTML generation, as this is what it's best at. My rule six comes to mind. We'll make text templates that Perl can handle, although we could of course generate these text templates from database images.
Yet another: "What about storing pictures in the database?" My personal view is that this is very silly. The images need to be stored on the web server anyway, and keeping them within the database is just going to bloat the database, and add another (vulnerable) step to an already complex process.
{ We won't here look into the trickery required to have, for example, Level 1 and 2 menus cross-referencing one another, for example, having a clickable link to "VCRs" from the main "TV" menu. The concept we need here is 'closely related categories', which is not encoded in our database at present. We could easily create a new table that associates related categories with one another! }
Why do we consider this to be wicked? Well, we think that wherever possible keys should be sequential. This means that there are no gaps in the numbers. You will soon appreciate that if we use external keys, no matter how smart the Perl script is that provides them, we have no hope of making one key number follow another, except under very unrealistic circumstances.
Now there are well-defined ways of generating unique sequential keys in most SQL implementations. Many of these break the rules embodied in SQL-92 and SQL-99. We will not break the rules, and (to keep things relatively simple) will generate our keys from the external data. This is not the best way to do things in real life, but remember, this whole big page is simply a teaching exercise. If we were to both stick to the SQL rules and use sequential keys, we'd have to introduce some fairly advanced things which we think are inappropriate at this stage.
{ later on we will have a page on keys using externally defined functions; will then have xref from here!}
Here is our table list:
All of the above are clickable, taking you to the above table structures.
Let's write the SQL code for each of these in turn. We won't burden each SQL definition with comments, as the table definitions and minor constraints we have attached should be self-explanatory. {Note that we could have added a whole lot more check constraints. We haven't. Naughty us!}
create table MANUFACTURER (
ManufacturerCode decimal(4,0) , constraint manuf_key1 primary key (ManufacturerCode), ManufacturerName varchar(20) , constraint manuf_name_null check (ManufacturerName is not null) ); |
create table CATEGORY (
CatCode decimal(4,0) , constraint category_key1 primary key (CatCode), CatParent decimal(4,0) default 0 , constraint category_parent_null check (CatParent is not null), constraint category_fgn_key foreign key (CatParent) references CATEGORY, CatName varchar(40) , constraint category_name_null check (CatName is not null), CatDesc varchar(64), CatAlign char(1) , constraint category_align_null check (CatAlign is not null) ); |
create table MODEL (
ModelCode decimal(8,0) , constraint model_key1 primary key (ModelCode), ModelName varchar(15) , constraint model_name_null check (ModelName is not null), ModelDesc varchar(64) , ModelParent decimal(4,0) , constraint model_parent_null check (ModelParent is not null), constraint model_parent_fgn_key foreign key (ModelParent) references CATEGORY, ManuCode decimal(6,0) , constraint model_manucode_null check (ManuCode is not null), constraint mocel_manucode_fgn_key foreign key (ManuCode) references MANUFACTURER, Price decimal(8,2) , constraint model_price_null check (Price is not null) ); |
create table PICTURE2 (
PictureCode decimal(8,0) , constraint picture_key1 primary key (PictureCode), PicName varchar(20) , constraint picture_name_null check (PicName is not null), PicAltTag varchar(30), PicWidth decimal(3,0) , constraint picture_width_null check (PicWidth is not null), PicHeight decimal(3,0) , constraint picture_hgt_null check (PicHeight is not null), PicUsage char(1) , constraint picture_use_null check (PicUsage is not null) ); |
create table FEATURE (
FeatureCode decimal(8,0) , constraint feature_key1 primary key (FeatureCode), FeatureType varchar(15) , constraint feature_type_null check (FeatureType is not null), FeatureText varchar(255) , constraint feature_text_null check (FeatureText is not null) ); |
create table MOD_FEATURE (
FeCode decimal(8,0), constraint mf_fecode_fgnkey foreign key (FeCode) references FEATURE, MoCode decimal(8,0), constraint mf_mocode_fgnkey foreign key (MoCode) references MODEL, PutMeAt decimal (1,0), constraint mf_putmfa_null check (PutMeAt is not null), MyNumber decimal(8,3) , Priority decimal(2,0) , constraint mf_priority_null check (Priority is not null), constraint mf_key1 primary key (FeCode, MoCode) ); |
create table MODEL_PIC (
ModCode decimal(8,0), constraint mp_modcode_fgnkey foreign key (ModCode) references MODEL, PicCode decimal(8,0), constraint mp_piccode_fgnkey foreign key (PicCode) references PICTURE2, constraint mp_key1 primary key (ModCode, PicCode) ); |
create table PLACE (
PlaceCode decimal(7,0) , constraint place_key1 primary key (PlaceCode), PlaceName varchar(25) , constraint place_name_null check (PlaceName is not null), PlaceIsIn decimal(7,0) , constraint place_in_null check (PlaceIsIn is not null), constraint place_in_fgnkey foreign key (PlaceIsIn) references PLACE, PlaceSize decimal(1,0) , constraint place_size_null check (PlaceSize is not null) ); |
create table TELNUM (
TShort decimal(7,0) , constraint telnum_key1 primary key (TShort), AreaCode decimal(5,0) , constraint telnum_area_null check (Areacode is not null), TelNumber decimal(8,0) , constraint telnum_num_null check (TelNumber is not null) ); |
create table SUPPLIER (
SupplierCode decimal(4,0) , constraint supp_key1 primary key (SupplierCode), SupplierName varchar(30) , constraint supp_name_null check (SupplierName is not null), SupplierEmail varchar(15) , constraint supp_email_null check (SupplierEmail is not null), SupplierAddr decimal(7,0) , constraint supp_addr_null check (SupplierAddr is not null), constraint supp_addr_fgnkey foreign key (SupplierAddr) references PLACE ); |
create table SUPP_TEL (
STelShort decimal(7,0), constraint st_telshort_fgnkey foreign key (STelShort) references TELNUM, SuppCode decimal(4,0), constraint st_supp_fgnkey foreign key (SuppCode) references SUPPLIER, constraint st_key1 primary key (STelShort, SuppCode) ); |
create table SUPP_AREA (
SuppCode decimal(4,0), constraint sa_suppcode_fgnkey foreign key (SuppCode) references SUPPLIER, AreaCode decimal(7,0), constraint sa_area_fgnkey foreign key (AreaCode) references PLACE, Status decimal(2,0) default 49 , constraint sa_area_null check (Status is not null), constraint sa_key1 primary key (SuppCode, AreaCode) ); |
create table CUSTOMER (
CustomerCode decimal(7,0) , constraint cust_key1 primary key (CustomerCode), CustomerSurname varchar(30) , constraint cust_surname_null check (CustomerSurname is not null), CustomerTitle varchar(5) , constraint cust_title_null check (CustomerTitle is not null), CustomerEmail varchar(30) , constraint cust_email_null check (CustomerEmail is not null), CustomerArea decimal(7,0) , constraint cust_area_null check (CustomerArea is not null), constraint cust_area_fgnkey foreign key (CustomerArea) references PLACE ); |
create table CUST_TEL (
cTelShort decimal(7,0), constraint ct_telshort_fgnkey foreign key (cTelShort) references TELNUM, CustCode decimal(7,0), constraint ct_custcode_fgnkey foreign key (CustCode) references CUSTOMER, constraint ct_key1 primary key (cTelShort, CustCode) ); |
create table ENQUIRY (
ModlCode decimal(8,0), constraint enq_modlcode_fgnkey foreign key (ModlCode) references MODEL, CustCode decimal(7,0), constraint enq_custcode_fgnkey foreign key (CustCode) references CUSTOMER, SuppCode decimal(4,0), constraint enq_suppcode_fgnkey foreign key (SuppCode) references SUPPLIER, Price decimal(8,2) , constraint enq_price_null check (Price is not null), EnqDate float , constraint enq_date_null check (EnqDate is not null), CustComment varchar(255), constraint enq_key1 primary key (ModlCode, CustCode, SuppCode) ); |
insert into CATEGORY (
CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 0, 0, 'APPLIANCE', 'The root category', 'X'); |
and also..
insert into PLACE (
PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values ( 0, 'Southern Africa', 0, 7 ); |
Notice how we refer both 'APPLIANCE' and 'Southern Africa' back to themselves. This signals that we are at the 'root of the tree', but we must exercise caution as there is potential for infinite regress (we can look up the 'parent' and get back to the same spot if we write stupid code). The PlaceSize for our entry we set at 7, which is not one of the standard sizes we agreed upon, but follows logically after 6 (for regions) and still leaves space at the top (for eg. 'Africa' and 'World')!
It is important that CATEGORY is well-defined. It's all very well to just pull out the categories and subcategories from Perl generated data (We'll later come across these Category, .SubCategory, and .SubSubCat fields), but better would be to pre-define these in the CATEGORY table (more work, but more security!). At the same time we can link them vertically through the CATEGORY.CatParent field. Then any non-existent value IS an error. We will take this somewhat more tedious but rigorous approach..
insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 1, 0, 'Microwaves', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 2, 0, 'Ovens & Stoves', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 3, 0, 'Fridges', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 4, 0, 'Freezers', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 5, 0, 'Washing Machines', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 6, 0, 'Tumble Dryers', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 7, 0, 'Dishwashers', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 8, 0, 'Television Sets', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 9, 0, 'Video Recorders', 'all', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 10, 1, 'Small', 'Microwaves', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 11, 1, 'Medium', 'Microwaves', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 12, 1, 'Large', 'Microwaves', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 13, 1, 'Extra Large', 'Microwaves', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 14, 2, 'Extractor Hoods', 'oven and stove', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 15, 2, 'Hobs & Combinations', 'oven and stove', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 16, 2, 'Built in ovens', 'oven and stove', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 17, 2, 'Stoves', 'oven and stove', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 18, 2, 'Gas cooking including hobs and stoves', 'oven and stove', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 19, 3, 'Gas Fridges', 'fridges', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 20, 3, 'Fridges with a top freezer', 'fridges', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 21, 3, 'Side by side fridge and freezer', 'fridges', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 22, 3, 'Fridges: freezer compartment on the bottom', 'fridges', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 23, 3, 'One Door Fridges', 'fridges', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 24, 4, 'Gas Freezers', 'freezers', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 25, 4, 'Smaller Chest Freezers: up to 220 L', 'freezers', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 26, 4, 'Chest freezers 221 to 320 L', 'freezers', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 27, 4, 'Large chest freezers', 'freezers', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 28, 4, 'Upright freezers', 'freezers', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 29, 5, 'Frontloading Washing machines', 'washing machines', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 30, 5, 'Top loading washing machines', 'washing machines', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 31, 5, 'Twin Tubs', 'washing machines', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 32, 8, 'Television sets 36 to 44 cm', 'tv', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 33, 8, 'Television Sets: 51 to 55 cm', 'tv', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 34, 8, 'Television Sets: 63 to 65 cm', 'tv', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 35, 8, 'Television Sets: 70 to 75 cm', 'tv', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 36, 8, 'Extra large Television Sets', 'tv', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 37, 9, 'Double head Video Cassette Recorders', 'vcr', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 38, 9, 'Four head Video Cassette Recorders', 'vcr', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 39, 9, 'Six head VCRs', 'vcr', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 40, 20, 'Fridges with a top freezer: 200 to 299 L', 'top freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 41, 20, 'Fridges with top freezers: 300 to 399 L', 'top freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 42, 20, 'Fridges with top freezer: 400 to 549 L', 'top freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 43, 20, 'Large top freezer fridges', 'top freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 44, 22, 'Bottom freezer fridges under 300 L', 'bottom freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 45, 22, 'Bottom freezer fridges 300 to 349 L', 'bottom freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 46, 22, 'Bottom freezer fridges 350 to 399 L', 'bottom freezer fridge', 'X'); insert into CATEGORY ( CatCode, CatParent, CatName, CatDesc, CatAlign ) values ( 47, 22, 'Larger bottom freezer fridges: 400+ L', 'bottom freezer fridge', 'X'); |
Whew! That took longer than I thought it would (thank heavens for cut and paste)! Note that as things stand, there is redundant (and denormalised) information due to similar CatDesc fields with similar CatParents, but the CatDesc descriptive fields can contain anything you want!
Again, it's infinitely preferable to pre-define the Regions, and probably also the Subregions before we start capturing data. This is a manageable task and will diminish errors substantially. Let's do what we did for category. First we will do regions, then subregions.
insert into PLACE
(PlaceCode, PlaceName, PlaceIsIn, PlaceSize)
values
(1, 'Eastern Cape', 0, 6);
insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (2, 'Free State', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (3, 'Gauteng', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (4, 'KwaZulu Natal', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (5, 'Mpumalanga', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (6, 'Namibia', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (7, 'North West', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (8, 'Northern Cape', 0, 6); insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) values (9, 'Northern Province', 0, 6);
insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize)
values (10, 'Western Cape', 0, 6);
|
See how we've linked the subregions to the regions. There is a lot of redundancy, and in fact we could clip the region names out of the subregions without further ado, provided we make allowances in our Perl scripts for this alteration.
{ALSO WHAT ABOUT CASE SENSITIVITY???}
create table BIG_FLAT_APPL (
Category varchar (30) , PicType varchar (1) , Brand varchar (20) , Model varchar (15) , Price decimal (8,2) , ThumbName varchar (15) , ThumbW decimal (3,0) , ThumbH decimal (3,0) , ThumbAlt varchar (30) , PixName varchar (15) , PixWidth decimal (3,0) , PixHeight decimal (3,0) , PixAlt varchar (30) , Summ1 varchar (40) , Summ2 varchar (40) , Summ3 varchar (40) , Summ4 varchar (40) , Summ5 varchar (40) , Summ6 varchar (40) , Summ7 varchar (40) , Summ8 varchar (40) , Summ9 varchar (40) , Summ10 varchar (40) , Left1 varchar (255) , Left2 varchar (255) , Left3 varchar (255) , Left4 varchar (255) , Left5 varchar (255) , Left6 varchar (255) , Left7 varchar (255) , Left8 varchar (255) , Left9 varchar (255) , Left10 varchar (255) , Left11 varchar (255) , Left12 varchar (255) , Left13 varchar (255) , Left14 varchar (255) , Left15 varchar (255) , Left16 varchar (255) , Left17 varchar (255) , Left18 varchar (255) , Left19 varchar (255) , Left20 varchar (255) , Left21 varchar (255) , Left22 varchar (255) , Left23 varchar (255) , Left24 varchar (255) , Left25 varchar (255) , Left26 varchar (255) , Left27 varchar (255) , Left28 varchar (255) , Left29 varchar (255) , Left30 varchar (255) , Bott1 varchar (255) , Bott2 varchar (255) , Bott3 varchar (255) , Bott4 varchar (255) , Bott5 varchar (255) , Bott6 varchar (255) , Bott7 varchar (255) , Bott8 varchar (255) , Bott9 varchar (255) , Bott10 varchar (255) , Bott11 varchar (255) , Bott12 varchar (255) , Bott13 varchar (255) , Bott14 varchar (255) , Bott15 varchar (255) , Seq decimal(8) ); |
create table BIG_FLAT_DEAL (
Region varchar (15), Subregion varchar (25), SupplierName varchar (30), SupplierAddr1 varchar (30), SupplierAddr2 varchar (30), SupplierAddr3 varchar (30), SupplierEmail varchar (15), SupplierTelCode decimal (6), SupplierTelNum1 decimal (10), SupplierTelNum2 decimal (10), SupplierTelNum3 decimal (10), Seq decimal (8) ); |
{ ??? if supplier gives a cellphone number???}
create table BIG_FLAT_CUST (
CustTitle varchar(5), CustSurname varchar(30), CTelCodeA decimal (6), CTelCodeB decimal (6), CTelNumA decimal (10,0), CTelNumB decimal (10,0), CEMail varchar(30), CustArea varchar(25), CApplCode varchar(15), CPrice decimal (8,2), CSupplier varchar(25), CustComm varchar(255), CDate float, Seq decimal(8,0) ); |
{NOTE: need to fix up scripts iro CtelNum1 etc}
insert into BIG_FLAT_CUST (
CustTitle, CustSurname, CTelCodeA, CTelCodeB, CTelNumA, CTelNumB, TelNum2, CEMail, CustArea, CApplCode, CPrice, CSupplier, CustComm, CDate, Seq ) values ( 'Ms', 'van der Merwe', 083, 01234, 9997654, 5551234, 'me@somebodyinsa.co.za', 'Gauteng Pretoria', 'Defy 631t', 2716.00, 'golddollpta', 'I want it and i want it now!', 0000000.0, 1 ); |
Next, dealer information..
insert into BIG_FLAT_DEAL (
Region, Subregion, SupplierName, SupplierAddr1, SupplierAddr2, SupplierAddr3, SupplierEmail, SupplierTelCode, SupplierTelNum1, SupplierTelNum2, SupplierTelNum3 ) values ( 'Gauteng', 'Pretoria', 'Gold Dollar Furnishers', '211 Du Toit Street', 'Pretoria', '', 'golddollpta', 012, 3227766, 3227767, 3227768 ); |
And finally, the most important of the lot..
insert into BIG_FLAT_APPL (
Category, PicType, Brand, Model, Price, ThumbName, ThumbW, ThumbH, ThumbAlt, PixName, PixWidth, PixHeight, PixAlt, Summ1, Summ2, Summ3, Summ4, Summ5, Summ6, Summ7, Summ8, Summ9, Summ10, Left1, Left2, Left3, Left4, Left5, Left6, Left7, Left8, Left9, Left10, Left11, Left12, Left13, Left14, Left15, Left16, Left17, Left18, Left19, Left20, Left21, Left22, Left23, Left24, Left25, Left26, Left27, Left28, Left29, Left30, Bott1, Bott2, Bott3, Bott4, Bott5, Bott6, Bott7, Bott8, Bott9, Bott10, Bott11, Bott12, Bott13, Bott14, Bott15, Seq ) values ( 'Fridges with a top freezer: 200 to 299 L', 'V', 'KIC', 'kt5021a', 1730.00, 'kt5021a_s.jpg', 41, 61, 'KIC 5021a fridge', 'kt5021a.jgp', 119, 251, 'kt5021a - large image', '215 L', 'all-round magnetic seals', '', '', '', '', '', '', '', '', 'total capacity 215 L', 'fridge 143 L', 'freezer 72L', 'Powder coated galvanized steel inner and outer cabinets', 'Rust-resistant', 'Electrostatic paint finish', 'Modern-looking smooth finish', 'Locks on both doors', 'Curved door soft-line finish', '100% CFC free and environment friendly', 'Adjustable front feet compensate for uneven floors', 'One piece galvanized steel expoxy-coated inner liners', 'Magnetic gaskets on all four edges of doors seal doors tightly keeping cold air inside', 'Adjustable steel shelves for individual packing needs', 'Adjustable thermostat in fridge section for different climatic needs', 'Interior lights', 'Cyclomatic defrost in fridge section', 'Large crisper drawer', 'Condiment and butter compartments in fridge', '', '', '', '', '', '', '', '', '', '', '', 'FREEZER SECTION', 'Magnetic gaskets for a tight seal, as for fridge', 'GUARANTEE - One year guarantee (misuse excluded)', 'Two-year warranty on working parts', 'DIMENSIONS (mm)', '1423', '500', '593', '', '', '', '', '', '', '', 1 ); |
The above insert statement is so unwieldy that only automatic generation makes sense. Once the commands are generated, we will run them within SQL to insert the data!
Data Movement | |
Move From | .. To |
BIG_FLAT_APPL | MANUFACTURER |
BIG_FLAT_APPL | CATEGORY |
BIG_FLAT_APPL | MODEL |
BIG_FLAT_APPL | PICTURE2 |
BIG_FLAT_APPL | MODEL_PIC |
BIG_FLAT_APPL | FEATURE |
BIG_FLAT_APPL | MOD_FEATURE |
BIG_FLAT_DEAL | PLACE |
BIG_FLAT_DEAL, BIG_FLAT_CUST | TELNUM |
BIG_FLAT_DEAL | SUPPLIER |
BIG_FLAT_DEAL | SUPP_TEL |
BIG_FLAT_DEAL | SUPP_AREA |
BIG_FLAT_CUST | CUSTOMER |
BIG_FLAT_CUST | CUST_TEL |
BIG_FLAT_CUST | ENQUIRY |
Note that there are multiple fields for each of the above 'summary transactions'. We will examine and implement each of these transactions in a separate section below.
NOTE: In the following, to make things more legible, we will refer to BIG_FLAT APPLIANCE, BIG_FLAT_DEAL and BIG_FLAT_CUST as BFA, BFD and BFC respectively! {LOOK UP HOW TO ALIAS INTELLIGENTLY}.
We will sneakily get our unique number from the unique Sequence number of the appliance!! Remember that we are doing this to keep things both simple and SQL-standard compliant, as a teaching exercise. Much later on, we'll find out more elegant ways of stamping in a unique (and sequential) number. Here goes..
insert into MANUFACTURER (ManufacturerName, ManufacturerCode) select Brand, min(Seq) from BIG_FLAT_APPL group by BRAND; |
See how we used the insert command to
copy from one table to another. We group
by Brand to identify individual manufacturers, and then give each a
unique but non-sequential number. (Clumsy!)
select Category, Brand, Model from BIG_FLAT_APPL where Category not in (select CatName from CATEGORY); |
Our basic approach is as follows:
insert into MODEL
(Price, ModelCode, ModelName, ModelParent, ManuCode, ModelDesc) select Price, Seq, Model, CatCode, ManufacturerCode, '' from BIG_FLAT_APPL, CATEGORY, MANUFACTURER where Category = CatName AND Brand = ManufacturerName; |
Note several things about the above statement:
Note that because {--for the time being, to update--} we have abandoned the use of the KEYSOURCES table to generate keys, and are instead using BFA.Seq, we have a problem with pictures, for each row provides two of them. We will use a very ugly fix.
insert into PICTURE2
(PictureCode, PicName, PicAltTag, PicWidth, PicHeight, PicUsage) select 2*Seq, ThumbName, ThumbAlt, ThumbW, ThumbH, 'T' from BIG_FLAT_APPL;
insert into PICTURE2
|
We create unique picture key codes for the thumbnails from the sequence number*2, or add 1 to this for the main pictures!
As described for PICTURE2, the codes 'T' and 'B' refer to thumbnail sized pictures and big pictures, respectively.
insert into MODEL_PIC (ModCode, PicCode) select ModelCode, PictureCode from BIG_FLAT_APPL, MODEL, PICTURE2 where Model = ModelName AND PixName = PicName; |
The idea is that we join the three tables mentioned, extracting lines where the Model matches the model we previously inserted into MODEL, and the picture BFA.PixName matches PICTURE.PicName, inserted in the previous section. We now have to do the same for the thumbnail pictures:
insert into MODEL_PIC (ModCode, PicCode) select ModelCode, PictureCode from BIG_FLAT_APPL, MODEL, PICTURE2 where Model = ModelName AND ThumbName = PicName; |
create table FE1 ( FeCode1 decimal(8,0), FeText1 varchar(255), FeMod1 decimal(8,0), FePos1 char(1), FeStatus1 decimal(2) ); |
We will give just three examples of the sort of insert statement we need, but something similar must be done for all 55 columns!!
insert into FE1 (FeCode1, FeText1, FeMod1, FePos1, FeStatus1)
select 0 + 55 * Seq,
Summ1,
ModelCode,
'S',
0
from BIG_FLAT_APPL, MODEL
where BIG_FLAT_APPL.Model = MODEL.ModelName
AND Summ1 <> '' ;
insert into FE1 (FeCode1, FeText1, FeMod1, FePos1, FeStatus1) select 10 + 55 * Seq, Left1, ModelCode, 'L', 0 from BIG_FLAT_APPL, MODEL where BIG_FLAT_APPL.Model = MODEL.ModelName AND Left1 <> '' ; insert into FE1 (FeCode1, FeText1, FeMod1, FePos1, FeStatus1) select 40 + 55 * Seq, Bott1, ModelCode, 'B', 0 from BIG_FLAT_APPL, MODEL where BIG_FLAT_APPL.Model = MODEL.ModelName AND Bott1 <> '' ; |
Okay, there's some trickery here:
Next, we must pull out all features, and retain a unique code for each. We (again, clumsily) do this by:
insert into FEATURE (FeatureCode, FeatureType, FeatureText)
select min(FeCode1), '', FeText1 from FE1 group by FeText1; |
{Note that we've copped out and left the FeatureType field a null string}
update FE1 set FePos1 = '1' where FePos1 = 'S';
update FE1 set FePos1 = '2' where FePos1 = 'L'; update FE1 set FePos1 = '4' where FePos1 = 'B'; create table MODX ( modcodeX decimal (8,0), featcodeX decimal (8,0), featposX decimal (1,0), fepriorX decimal(1,0)); insert into MODX (modcodeX, featcodeX, featposX, fepriorX) select FeMod1, FeatureCode, cast(FePos1 as decimal), FeStatus1 from FE1, FEATURE where FeText1 = FeatureText; insert into MOD_FEATURE (FeCode, MoCode, PutMeAt, Priority) select featcodeX, modcodeX, sum(featposX), min(fepriorX) from MODX group by featcodeX, modcodeX; drop table FE1; |
Some more trickery here, and a tradeoff. We've chosen to represent PutMeAt as a decimal, allowing us to have one entry in MOD_FEATURE with several meanings. For example, say we have a video recorder with the same feature in both the summary menu and the detail menu. If we were to have two entries in MOD_FEATURE, we would have to include PutMeAt as another component of the key, or create another table. We cop out and allow the PutMeAt field to tell us that the item is represented in two places. The tradeoff is that the Priority field potentially has multiple values (and can you see the knock-on effect of this..?).
Our implementation above reflects our decision. We group by Appliance code and feature, and then add up all the "PutMeAts" which we conveniently turn into numeric codes (Note the use of cast)! We also find the lowest priority, and attach this priority to the row!
See that here we DO leave MOD_FEATURE.MyNumber as NULL.
{This is indeed
wicked, but inserting an arbitrary number into the field would
be more wicked! }
Notice how we clean up, removing the temporary tables FE1 and FE2.
We now have to do the BIG_FLAT_DEAL thing! The relevant fields are
Region and Subregion (which should already exist, or force an error),
and then SupplierAddr1 and SupplierAddr2. Note that (for our purposes)
we will assume that SupplierAddr1 is unique.
{This is a gross
over-simplification}.
It is
important that we realise that SupplierAddr2 may be the same as the
subregion! Let's make allowances for the possibility
that SupplierAddr2 = Subregion in our data insertion!
select Region, SubRegion, SupplierName from BIG_FLAT_DEAL where (Region not in select PlaceName from PLACE where PlaceSize = 6) OR (SubRegion not in select PlaceName from PLACE where PlaceSize = 4); |
The script is easily understood - if the dealer Region or Subregion isn't already in our tables in the appropriate place, add it to a list of duds. Give us the supplier name too, so we can easily find the erroneous item. If the select statement returns any item at all, it's a dud!
insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) select 2 * BIG_FLAT_DEAL.Seq, SupplierAddr2, PlaceCode, 3 from BIG_FLAT_DEAL, PLACE where Subregion = PLACE.PlaceName AND SupplierAddr2 not in (select PlaceName from PLACE where PlaceSize = 4); |
We do a cross join of BIG_FLAT_DEAL on PLACE, and insert the relevant row (with it's associated parent PLACE code). The value of 3 for PlaceSize is a convenience that simply says "it's in between a SubRegion and an address". The final AND condition is simply to prevent the case where SupplierAddr2 is already represented as a SubRegion (PlaceSize = 4)! See how the placecode is created from BFD.Seq. Compare this with the next section - we now do something very similar for SupplierAddr1:
insert into PLACE (PlaceCode, PlaceName, PlaceIsIn, PlaceSize) select 1 + 2 * BIG_FLAT_DEAL.Seq, SupplierAddr1, PlaceCode, 1 from BIG_FLAT_DEAL, PLACE where SupplierAddr2 = PLACE.PlaceName; |
See how we generate a unique code for PlaceCode from BFD.Seq! We've chosen the value of 1 for PlaceSize - this leaves us leeway to later insert another code of 2 for an intermediate line in the address! { Note that the implicit assumption that the name SupplierAddr2 can never be the same as an item in PLACE with a PlaceSize of six}.
We defer the "BIG_FLAT_CUST -> TELNUM" transaction until
later.
insert into TELNUM (TShort, AreaCode, TelNumber)
select 3 * Seq,
SupplierTelCode,
SupplierTelnum1
from BIG_FLAT_DEAL;
insert into TELNUM (TShort, AreaCode, TelNumber) select 1 + 3 * Seq, SupplierTelCode, SupplierTelnum2 from BIG_FLAT_DEAL where SupplierTelnum2 <> ''; insert into TELNUM (TShort, AreaCode, TelNumber) select 2 + 3 * Seq, SupplierTelCode, SupplierTelnum3 from BIG_FLAT_DEAL where SupplierTelnum3 <> ''; |
See how we exclude the cases where a null string has been provided
for the second and third telephone numbers.
{There is a potential
problem of redundancy where say a customer works for a Dealer and provides
the same telephone number, which won't be picked up in the above.
}.
insert into SUPPLIER (SupplierCode, SupplierName, SupplierEmail, SupplierAddr) select Seq, BIG_FLAT_DEAL.SupplierName, BIG_FLAT_DEAL.SupplierEmail, PlaceCode from BIG_FLAT_DEAL, PLACE where SupplierAddr1 = PlaceName; |
We've already done all the setup of SupplierAddr1 and its chain of association. Here we actually create a dealer row. We need the inner join to pick out the Placecode.
insert into SUPP_TEL (STelShort, SuppCode) select TShort, SupplierCode from BIG_FLAT_DEAL, SUPPLIER, TELNUM where BIG_FLAT_DEAL.SupplierName = SUPPLIER.SupplierName AND SupplierTelCode = TELNUM.AreaCode AND (SupplierTelNum1 = TELNUM.TelNumber OR SupplierTelNum2 = TELNUM.TelNumber OR SupplierTelNum3 = TELNUM.TelNumber); |
insert into SUPP_AREA (SuppCode, AreaCode, Status) select SupplierCode, PlaceCode, 49 from BIG_FLAT_DEAL, SUPPLIER, PLACE where BIG_FLAT_DEAL.SupplierName = SUPPLIER.SupplierName AND BIG_FLAT_DEAL.SubRegion = PLACE.PlaceName; |
Okay, we put in the 49 anyway, more out of pedantry than for any other good reason.
{ What happens ????????? if we have a dealer represented in more than one place. CHECK THIS OUT CAREFULLY = HAVE TO ENSURE THAT DEALER ENTRY DOESN'T EXIST, AND IF IT DOES, DON'T DUPLICATE IT !!! Could be complex with a hierarcy of dealers or a central dealer controlling several stores?}
insert into CUSTOMER (CustomerCode, CustomerSurname, CustomerTitle, CustomerEmail, CustomerArea) select Seq, CustSurname, CustTitle, CEmail, PlaceCode from BIG_FLAT_CUST, PLACE where CustArea = PLACE.PlaceName AND PLACE.PlaceSize = 4; |
We ensure that the customer area supplied is a SubRegion, and that it exists, upon which we insert the PlaceCode. But NOTE - The above makes no check as to whether the customer exists. Let's try:
create table CUS1 (
CCode1 decimal(7,0),
CSurname1 varchar(30),
CTitle1 varchar(5),
CEmail1 varchar(30),
CArea1 decimal(7,0),
CCodeA1 decimal(5),
dCodeB1 decimal(5),
CTelA1 decimal(8),
CTelB1 decimal(8),
Price1 decimal(8,2),
Date1 float,
Model1 varchar(15),
Suppl1 varchar(25)
);
insert into CUS1 (CCode1, CSurname1, CTitle1, CEmail1, CCodeA1, CCodeB1, CTelA1, CTelB1, CArea1, Price1, Date1, Model1, Suppl1) select Seq, CustSurname, CustTitle, CEmail, CTelCodeA, CTelCodeB, CTelNumA, CTelNumB, PlaceCode, CPrice, CDate, CApplCode, CSupplier from BIG_FLAT_CUST, PLACE where CustArea = PLACE.PlaceName AND PLACE.PlaceSize = 4; create table CUS2 ( OLDcode decimal(7,0), CCode2 decimal(7,0), CSurname2 varchar(30), CTitle2 varchar(5), CEmail2 varchar(30), CArea2 decimal(7,0), CCodeA2 decimal(5), dCodeB2 decimal(5), CTelA2 decimal(8), CTelB2 decimal(8) Price2 decimal(8,2), Date2 float, Model2 varchar(15), Suppl2 varchar(25) ); insert into CUS2 (CCode2, CSurname2, CTitle2, CEmail2, CCodeA2, CCodeB2, CTelA2, CTelB2, CArea2, OLDcode, Price2, Date2, Model2, Suppl2) select CCode1, CSurname1, CTitle1, CEmail1, CCodeA1, CCodeB1, CTelA1, CTelB1, CArea1, CustomerCode Price1, Date1, Model1, Suppl1 from CUS1 left join CUSTOMER on CUS1.CEmail1 = CUSTOMER.CustomerEmail update CUS2 set CCode2 = OLDcode where OLDcode is not NULL; |
We do what we did in section 6. above, so that we now have replaced new customer numbers by old ones if they exist!
{
NB. This is very simplistic. The match to old
customer depends on the customer's email being the same.
Very clusmy, but we are not going to (YET) go into the complexity
of other matching!
WHAT ABOUT NEW TEL NUMBER FOR OLD CUSTOMER?!
insert into TELNUM (TShort, AreaCode, TelNumber)
select 2 * CCode2,
CCodeA2,
CTelA2
from CUS2
where CCodeA2 || CTelA2
not in (select AreaCode || TelNumber
from TELNUM);
insert into TELNUM (TShort, AreaCode, TelNumber) select 1 + 2 * CCode2, CCodeB2, CTelB2 from CUS2 where CCodeB2 || CtelB2 not in (select AreaCode || TelNumber from TELNUM); insert into CUST_TEL (cTelShort, CustCode) select TShort, CCode2 from CUS2, TELNUM where CCodeA2 = AreaCode AND CTelA2 = TelNumber; insert into CUST_TEL (cTelShort, CustCode) select TShort, CCode2 from CUS2, TELNUM where CTelB2 is not NULL AND CCodeB2 = AreaCode AND CTelB2 = TelNumber; drop table CUS1; |
The first insert statement takes out all the telephone numbers
(with their area codes) from CUS2, provided that the codes
don't already exist within the TELNUM table. We use the clumsy concatenation
because otherwise we would have difficulty if by chance two customers had
the same telephone number but different area codes!
{Note that there is still a problem!
If a customer has changed their email address, they will be seen
as a new customer. We still need to link them to their own unique
telephone number that already (should) exist within the database.
}
The second insert statement does exactly the same for the second telephone code+number combination in CUS2.
The last two insert statements associate the customer number (either old or new) with one or two telephone numbers. We then drop the temporary table CUS1, but keep CUS2 for later!
insert into ENQUIRY (ModlCode, CustCode, SuppCode, Price, Date, CustComment)
select
ModelCode,
CCode2,
SupplierCode,
Price2,
Date2,
CustComm
from CUS2, MODEL, SUPPLIER
where
Suppl2 = SupplierName
AND
Model2 = ModelName;
drop table CUS2; |
Note how Price2, Model2, Date2 and Suppl2, previously defined as apparently unnecessary components of CUS2, finally come into their own!
{We have a potential problem if an appliance has been dropped from the database after the customer has ordered it?! What to do??}
We therefore turn to the task of extracting data tables that will be useful to Perl in recreating HTML pages for the appliance4u website:
{here have scripts!!}