Ladies and Gentlemen
So here is the issue I am facing. My company manages the networks of several other companies. We are constantly adding more companies to our client list. We use a number of different tools to monitor the health of the networks we monitor, Each of these tools reports to consoles located one a server system we place on our client networks. We collect data from antivirus, active directory, windows update services, a web based monitoring tool, and a disk health tool.
Each of these tools categorizes the data it collects based on the computer name which simplifies the data aggregation when done by hand. Each of these tools also exports to a .csv or .xls format allowing for relatively easy importation.
I have been assigned the task of building a database to automate the aggregation that is needed and thus drastically reducing the time needed inspect for problems on the client network. However I have never designed a database such as this before and I am finding it exceedingly difficult to discover the proper layout for tables and relationships in order to ensure we have the data we need and that it is being properly imported to ensure that all problems on the client network are identified.
Each tool exports as follows:
AD:
Computer name
modified date
discription
AV/AS (we use several different AV utilities but they all have these common fields):
Computer Name
Scan date
av def #
as def #
last report date
and several other fields
WSUS:
computer name
number of updates needed
number of updates failed
last report date
a web report agent: (my boss would like this to be the "primary index" of all computers we manage, as all units regardless of company report to this tool)
company name
computer name
agent version number
and several other fields
disk status tool:
computer name
report date
tool version
drive letter
(each computer name typically has more than one drive letter, and each drive letter has several fields associated with it when this data is imported all the previous fields are properly enumerated.)
Now there are two problems with the various tools. At any given time one tool may contain data on a computer none of the others know of. This happens for many reasons but primarily because when old systems are retired, they may not always be removed properly, and/or the data may not be cleared for whatever reason. Secondly each tool can have gaps in its data, IE a date field may be left blank or only have a "-" in it if the unit has never reported in.
Now I can get the data to import, and I have found various ways to link the tables together. The issue that I am having is how exactly I can import the .csv files given the amount of data replication it creates, and maintain referential integrity.
I would greatly welcome any advise or help anyone can offer me.