Occupational Injuries and Illnesses Incidence Rate Data (SI)
si.txt

Section Listing

1. Survey Definition
2. FTP files listed in the survey directory
3. Time series, series file, data file, & mapping file definitions and
   relationships
4. Series file format and field definitions
5. Data file format and field definitions
6. Mapping file formats and field definitions
7. Data Element Dictionary

================================================================================
Section 1
================================================================================

The following is a definition of: OCCUPATIONAL INJURIES AND ILLNESSES
INCIDENCE RATE DATA (SI)

Survey Description: The occupational injury and illness incidence rate is
an annual measure of the incidence of work-related injuries and illnesses,
expressed as the number of injuries and illnesses, or lost workdays, per 100
full-time employees. For this purpose, 200,000 employee hours represent 100
employee years. Data collected through the annual survey are based on
records that employers maintain under the Occupational Safety and Health
Act. The data include all cases resulting from work accidents or exposure in
the work environment that result in death, nonfatal illness, or nonfatal
injury involving medical treatment (beyond first aid), loss of
consciousness, restriction of work or motion, or transfer to another job.
Virtually the entire private sector is covered by the survey, except
self-employed individuals, farms with fewer than 11 employees, and employers
regulated by other Federal safety and health laws. Federal, State, and local
government agencies are also excluded. Data conforming to definitions of
recordable occupational injuries and illnesses for coal, metal and nonmetal
mining, and railroad transportation are provided by the Mine Safety and
Health Administration, U.S. Department of Labor, and the Federal Railroad
Administration, U.S. Department of Transportation.

The incidence rates are produced for industries defined by the 1987
Standard Industrial Classification (SIC). The survey sample design uses a
stratified random sample with a Neyman allocation. The characteristics used
to stratify the units are State, SIC code, and employment. Sampling ratios
vary by employment-size class: all units above a certain size are selected
with certainty, and progressively smaller proportions are sampled in each
smaller employment-size class. The data for all reporting units in each
industry are expanded by the inverse of the sampling ratio and benchmarked
to the appropriate employment level in each industry. Reports are collected
from about 180,000 sample units.

Summary Data Available: The incidence rates and numbers of cases are
calculated for three categories: (1) injury and illness combined, (2) injury
only, and (3) illness only. The incidence rates and numbers of cases for
each category are available at the 2-digit SIC industry level in
agriculture, forestry, and fishing; the 3-digit level in oil and gas
extraction, construction, transportation and public utilities, wholesale and
retail trade, finance, insurance, and real estate, and services; and the
4-digit level in manufacturing. Estimates of incidence rates and numbers of
cases for industries are also made for various severity classifications. For
each industry, the incidence rates and numbers of cases are available for
total nonfatal recordable cases; cases with days away from work, job
transfer, or restriction; cases with days away from work; cases with job
transfer or restriction only; and other recordable cases. Additionally,
estimates of incidence rates and numbers of cases for industries are made
for various illness categories: skin diseases or disorders, respiratory
conditions, poisonings, and all other illnesses. The estimating procedure
generates occupational injury and illness estimates for approximately 835
SIC codes.
This dataset, however, excludes estimates for some industry codes when one
of the following situations occurred:

1.  Data were suppressed to protect the confidentiality of respondents.

2.  Annual average employment for the industry was fewer than 10,000.
    However, estimates for an industry with an annual average employment of
    fewer than 10,000 were published if the majority of the employment was
    reported in the survey.

3.  The relative standard error for the estimate was above a limit based on
    the number of cases.

4.  The benchmark factor for the industry was less than 0.90 or greater
    than 1.49.

Data for an unpublished industry were included in the total for the broader
industry level of which it is a part.

Calculation of Incidence Rates:

1.  The incidence rates represent the number of injuries and/or illnesses or
    lost workdays per 100 full-time workers (per 10,000 full-time workers
    for illness rates) and were calculated as:

        (N / EH) x 200,000      (x 20,000,000 for illness rates)

    where:

        N          = number of injuries and/or illnesses or lost workdays.
        EH         = total hours worked by all employees during the
                     calendar year.
        200,000    = base for 100 full-time equivalent workers
                     (working 40 hours/week, 50 weeks/year).
        20,000,000 = base for 10,000 full-time equivalent workers
                     (working 40 hours/week, 50 weeks/year).

2.  Average lost workdays are calculated as:

        total lost workdays / total lost workday cases

Frequency of Observations: All data are annual.

Data Characteristics: Rates are stored to one decimal place. Numbers of
cases are in thousands.

References: BLS Handbook of Methods, Chapter 9, "Occupational Safety and
Health Statistics", 1997 edition.

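The incidence-rate calculation in Section 1 can be sketched in Python as
shown here. This is only an illustration: the function names and the input
figures are hypothetical and are not part of the survey documentation. The
two bases (200,000 and 20,000,000) and the one-decimal rounding come from
the text above.

```python
# Minimal sketch of the Section 1 formulas. Function names and sample
# figures are hypothetical, chosen only for illustration.

BASE_PER_100 = 200_000        # 100 full-time workers x 40 hours x 50 weeks
BASE_PER_10000 = 20_000_000   # 10,000 full-time workers x 40 hours x 50 weeks


def incidence_rate(cases, hours_worked, base=BASE_PER_100):
    """Return (N / EH) x base, rounded to one decimal place.

    cases        -- N, number of injuries and/or illnesses or lost workdays
    hours_worked -- EH, total hours worked by all employees in the year
    base         -- 200,000 by default; pass 20,000,000 for illness rates
    """
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return round(cases / hours_worked * base, 1)


def average_lost_workdays(total_lost_workdays, lost_workday_cases):
    """Return total lost workdays divided by total lost workday cases."""
    if lost_workday_cases <= 0:
        raise ValueError("lost_workday_cases must be positive")
    return total_lost_workdays / lost_workday_cases


# Hypothetical establishment: 12 recordable cases, 3 of them illnesses,
# over 400,000 hours worked.
combined_rate = incidence_rate(12, 400_000)
illness_rate = incidence_rate(3, 400_000, base=BASE_PER_10000)
avg_days = average_lost_workdays(120, 8)
```

With these made-up inputs, the combined rate is expressed per 100 full-time
workers and the illness rate per 10,000 full-time workers, which is why the
two values are not directly comparable.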
================================================================================
Section 2
================================================================================

The following Occupational Injuries and Illnesses Incidence Rate Data files
are on the BLS internet site in the sub-directory pub/time.series/si:

si.case.type       - Case type codes mapping file
si.contacts        - Contacts for si survey
si.data.1.AllData  - All data
si.data.type       - Data type codes mapping file
si.division        - Division codes mapping file
si.footnote        - Footnote codes mapping file
si.industry        - Industry codes mapping file
si.period          - Period codes mapping file
si.series          - All series and their beginning and end dates
si.txt             - General information

================================================================================
Section 3
================================================================================

The definition of a time series, and the relationships among series, data,
and mapping files, are detailed below:

A time series refers to a set of data observed over an extended period of
time at consistent time intervals (e.g., monthly, quarterly, semi-annually,
annually). BLS time series data are typically produced at monthly intervals
and represent data ranging from a specific consumer item in a specific
geographical area whose price is gathered monthly to a category of worker in
a specific industry whose employment rate is being recorded monthly, etc.

The FTP files are organized such that data users are provided with the
following set of files to use in their efforts to interpret data files:

a)  a series file (only one series file per survey)
b)  mapping files
c)  data files

The series file contains a set of codes which, together, compose a series
identification code that serves to uniquely identify a single time series.
Additionally, the series file also contains the following series-level
information:

a)  the period and year corresponding to the first data observation
b)  the period and year corresponding to the most recent data observation

The mapping files are definition files that contain explanatory text
descriptions that correspond to each of the various codes contained within
each series identification code.

The data file contains one line of data for each observation period
pertaining to a specific time series. Each line contains a reference to the
following:

a)  a series identification code
b)  year in which data is observed
c)  period for which data is observed (M13, Q05, and S03 indicate annual
    averages)
d)  value
e)  footnote code (if available)

================================================================================
Section 4
================================================================================

File Structure and Format: The following represents the file format used to
define si.series. Note that the Field Numbers are for reference only; they
do not exist in the database. Data files are in ASCII text format. Data
elements are separated by tabs; the first record of each file contains the
column headers for the data elements stored in each field. Each record ends
with a new line character.

Field #/Data Element    Length      Value(Example)

1.  series_id           17          SIU00000001
2.  division_code       2           00
3.  industry_code       4           0000
4.  data_type_code      1           1
5.  case_type_code      1           T
6.  begin_year          4           1989
7.  begin_period        3           A01
8.  end_year            4           2000
9.  end_period          3           A01

The series_id (SIU00000001) can be broken out into:

Code                    Value

survey abbreviation  =  SI
seasonal (code)      =  U
division_code        =  00
industry_code        =  0000
data_type_code       =  0
case_type_code       =  1

================================================================================
Section 5
================================================================================

File Structure and Format: The following represents the file format used to
define each data file. Note that the field numbers are for reference only;
they do not exist in the database. Data files are in ASCII text format. Data
elements are separated by tabs; the first record of each file contains the
column headers for the data elements stored in each field. Each record ends
with a new line character.

File Name: si.data.1.AllData

The above-named data file has the following format:

Field #/Data Element    Length      Value(Example)

1.  series_id           17          SIU00000001
2.  year                4           1989
3.  period              3           A01
4.  value               12          0.4
5.  footnote_codes      10          It varies

The series_id (SIU00000001) can be broken out into:

Code                    Value

survey abbreviation  =  SI
seasonal (code)      =  U
division_code        =  00
industry_code        =  0000
data_type_code       =  0
case_type_code       =  1

================================================================================
Section 6
================================================================================

File Structure and Format: The following represents the file format used to
define each mapping file. Note that the field numbers are for reference
only; they do not exist in the database. Mapping files are in ASCII text
format. Data elements are separated by tabs; the first record of each file
contains the column headers for the data elements stored in each field. Each
record ends with a new line character.

File Name: si.case.type

Field #/Data Element    Length      Value(Example)

1.  case_type_code      1           1
2.  case_type_text      55          Text

File Name: si.data.type

Field #/Data Element    Length      Value(Example)

1.  data_type_code      1           3
2.  data_type_text      60          Text

File Name: si.division

Field #/Data Element    Length      Value(Example)

1.  division_code       2           70
2.  division_name       50          Text

File Name: si.footnote

Field #/Data Element    Length      Value(Example)

1.  footnote_code       1           C
2.  footnote_text       100         Text

File Name: si.industry

Field #/Data Element    Length      Value(Example)

1.  division_code       2           09
2.  industry_code       4           0180
3.  industry_name       50          Text

File Name: si.period

Field #/Data Element    Length      Value(Example)

1.  period              3           A01
2.  period_abbr         5           ANN
3.  period_name         20          Text

================================================================================
Section 7
================================================================================

OCCUPATIONAL INJURIES AND ILLNESSES INCIDENCE RATE DATA (SI) DATABASE ELEMENTS

Data Element    Length  Value(Example)            Description

begin_period    3       A01=Annual                Identifies first observation
                                                  of data series by frequency
                                                  and period.

begin_year      4       YYYY                      Identifies earliest year for
                        Ex: 1985                  which data series is
                                                  available.

case_type_code  1       Ex: T=Total recordable    Code identifying type of
                        cases of poisoning        cases to which the incidence
                                                  rate applies.

case_type_text  55      Text                      Name identifying the type of
                        Ex: Injury,illness        cases to which the incidence
                                                  rate refers.

data_type_code  1       Ex: 1=Rate of injury      Code identifying the data
                        cases per 100             type to which the incidence
                        full-time workers         rate refers.

data_type_text  60      Text                      Name identifying the data
                        Ex: Lost workdays         type to which the incidence
                                                  rate refers.

division_code   2       Ex: 10=Mining             Code identifying the major
                                                  industry division.

division_name   50      Text                      Name of the major industry
                        Ex: Services              division.

end_period      3       A01=Annual                Identifies last observation
                                                  of data series by frequency
                                                  and period.

end_year        4       YYYY                      Identifies latest year for
                        Ex: 1990                  which data are available.

footnote_code   1       C                         Identifies footnote for the
                                                  data series.

footnote_codes  10      It varies                 Identifies footnotes for the
                                                  data series.

footnote_text   100     Text                      Contains the text of the
                                                  footnote.

industry_code   4       Ex:0000=Private industry  SIC code identifying
                                                  industry.

industry_name   50      Text                      Name of industry to which
                        Ex: Mining                data pertain.

period_abbr     5       Period name abbreviation  Abbreviation of period name.
                        Ex: ANN

period          3       A01=Annual                Identifies period for which
                                                  data is observed.

period_name     20      Text                      Full name of period to which
                        Ex: Annual                the data observation refers.

rounded         1       N=Not rounded to 0        Code indicating if "0" value
                        Y=Rounded to 0            is the result of rounding.

series_id       17      Code series identifier    Code identifying the
                        Ex: SIU00000001           specific series.

value           12      Data value                Incidence rate.
                        Ex: 0.4

year            4       YYYY                      Identifies year of
                        Ex: 1990                  observation.
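
The file layouts in Sections 4 and 5 can be read with a short Python sketch
such as the one below. It is an illustration under stated assumptions, not
an official reader. It assumes that the series_id is split by position using
the component widths from the breakdown above (2, 1, 2, 4, 1, 1), that the
stated field length of 17 reflects trailing padding, and that values are
numeric or empty. The names SeriesId, parse_series_id, and read_data_rows
are hypothetical, and the sample record is built from the example values in
Section 5 rather than taken from the real data file.

```python
import csv
import io
from typing import NamedTuple


class SeriesId(NamedTuple):
    """Components of a series_id, per the breakdown in Sections 4 and 5."""
    survey: str
    seasonal: str
    division_code: str
    industry_code: str
    data_type_code: str
    case_type_code: str


def parse_series_id(series_id):
    """Split a series_id such as SIU00000001 by position.

    Assumption: component widths are 2, 1, 2, 4, 1, 1 characters, and any
    surrounding whitespace is padding to the 17-character field length.
    """
    s = series_id.strip()
    if len(s) != 11:
        raise ValueError(f"unexpected series_id length: {s!r}")
    return SeriesId(s[0:2], s[2], s[3:5], s[5:9], s[9], s[10])


def read_data_rows(handle):
    """Yield one dict per record of a tab-separated data file.

    The first record is taken as the column headers, as Section 5 states.
    Header names and values are stripped of padding (an assumption).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for raw in reader:
        row = {k.strip(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        yield {
            "series": parse_series_id(row["series_id"]),
            "year": int(row["year"]),
            "period": row["period"],
            # Empty values are kept as None rather than guessed at.
            "value": float(row["value"]) if row["value"] else None,
            "footnote_codes": row.get("footnote_codes", ""),
        }


# Illustrative record assembled from the example values in Section 5.
SAMPLE = (
    "series_id        \tyear\tperiod\tvalue\tfootnote_codes\n"
    "SIU00000001      \t1989\tA01\t         0.4\t\n"
)

rows = list(read_data_rows(io.StringIO(SAMPLE)))
first = rows[0]
```

To read the real file, the same function could be given an open handle on
si.data.1.AllData. The codes in each parsed series_id can then be looked up
in the mapping files from Section 6, which use the same tab-separated layout
with a header record.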