Occupational Injuries and Illnesses Incidences Rate Data (SI)
				   si.txt


Section Listing

1. Survey Definition
2. FTP files listed in the survey directory.
3. Time series, series file, data file, & mapping file definitions and relationships
4. Series file format and field definitions
5. Data file format and field definitions
6. Mapping file formats and field definitions
7. Data Element Dictionary

================================================================================
Section 1
================================================================================

The following is a definition of:  OCCUPATIONAL INJURIES AND ILLNESSES INCIDENCES 
				   RATE DATA (SI)

Survey Description:  The occupational injury and illness incidence rate is
an annual measure of the incidence of work-related injuries and illnesses,
expressed as the number of injuries and illnesses, or lost workdays, per
100 full-time employees.  For this purpose, 200,000 employee hours
represent 100 employee years.

Data collected through the annual survey are based on records which 
employers maintain under the Occupational Safety and Health Act.  The data 
include all cases resulting from work accidents or exposure in the work 
environment which result in death, nonfatal illness, or nonfatal injury 
which involves medical treatment (beyond first aid), loss of consciousness, 
restriction of work or motion, or transfer to another job.

Virtually the entire private sector is covered by the survey, except 
self-employed individuals, farms with fewer than 11 employees, and employers 
regulated by other Federal safety and health laws.  Federal, State, and
local government agencies are also excluded.  Data conforming to the
definitions of recordable occupational injuries and illnesses for coal,
metal, and nonmetal mining and for railroad transportation are provided by
the Mine Safety and Health Administration, U.S. Department of Labor, and
the Federal Railroad Administration, U.S. Department of Transportation,
respectively.

Incidence rates are produced for industries defined by the 1987 Standard
Industrial Classification (SIC).  The survey sample design uses a
stratified random sample with Neyman allocation.  Units are stratified by
State, SIC code, and employment.  Sampling ratios vary by employment-size
class:  all units above a certain size are selected with certainty, and
progressively smaller proportions are selected in each smaller
employment-size class.  The data for all reporting units in each industry
are expanded by the inverse of the sampling ratio and benchmarked to the
appropriate employment level in each industry.  Reports are collected from
about 180,000 sample units.
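
The allocation and expansion steps described above can be sketched as
follows.  This is an illustration only, not the survey's production
procedure:  the function names, the stratum inputs (unit counts and
standard deviations), and the simple cap-and-round rule are assumptions
made for the example.  Neyman allocation assigns sample to each stratum in
proportion to N_h x S_h; the expansion weight is the inverse of the
sampling ratio.

```python
def neyman_allocation(total_sample, strata):
    """Allocate total_sample across strata in proportion to N_h * S_h.

    strata maps a stratum label to (N_h, S_h): the number of units in the
    stratum and the standard deviation of the study variable.  Both are
    hypothetical inputs for this sketch.  An allocation is capped at N_h,
    so a stratum whose share reaches N_h is taken with certainty; the
    sketch does not redistribute any sample freed by the cap.
    """
    products = {h: n_units * sd for h, (n_units, sd) in strata.items()}
    total = sum(products.values())
    allocation = {}
    for h, (n_units, _sd) in strata.items():
        share = total_sample * products[h] / total
        allocation[h] = min(n_units, max(1, round(share)))
    return allocation


def expansion_weight(n_units, n_sampled):
    """Inverse of the sampling ratio n_sampled / n_units."""
    return n_units / n_sampled
```

With 100 large units (standard deviation 50) and 10,000 small units
(standard deviation 2), a sample of 500 allocates 100 to the large stratum
(certainty, weight 1.0) and 400 to the small stratum (weight 25.0).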

Summary Data Available:  The incidence rates and numbers of cases are calculated
for three categories:  (1)  injury and illness combined; (2)  injury only; and 
(3)  illness only.

The incidence rates and numbers of cases for each category are available at
the following SIC industry levels:

   2-digit:	agriculture, forestry, and fishing.

   3-digit:	oil and gas extraction; construction; transportation and
		public utilities; wholesale and retail trade; finance,
		insurance, and real estate; and services.

   4-digit:	manufacturing.

Estimates of incidence rates and numbers of cases for industries are also made for various 
severity classifications.  For each industry, the incidence rates and numbers of cases are 
available for total nonfatal recordable cases; cases with days away 
from work, job transfer or restriction; cases with days away from work; cases 
with job transfer or restriction only; and other recordable cases. 

Additionally, estimates of incidence rates and numbers of cases for industries are made 
for various illness categories:  skin diseases or disorders, respiratory conditions, poisonings, 
and all other illnesses. 

The estimating procedure generates occupational injury and illness estimates 
for approximately 835 SIC codes.  This dataset, however, excludes estimates 
for several industry codes if one of the following situations occurred:

   1.	Data were suppressed to protect the confidentiality of respondents.

   2.	Annual average employment for the industry was fewer than 10,000.  
	However, estimates for an industry with an annual average employment
	of less than 10,000 were published if the majority of the employment
	was reported in the survey.

   3.	The relative standard error for the estimate was above a limit based on 
	the number of cases.

   4.	The benchmark factor for the industry was less than 0.90 or greater
	than 1.49.

Data for an unpublished industry were included in the total for the broader 
industry level of which it is a part.
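
The four exclusion rules above can be expressed as a single screening
function.  This is a hedged sketch:  the function name and parameters are
hypothetical, and because the relative standard error limit is described
only as "based on the number of cases", it is passed in as a parameter
rather than computed.

```python
def is_publishable(employment, majority_reported, rse, rse_limit,
                   benchmark_factor, suppressed=False):
    """Return True if an industry estimate passes all four screens.

    All parameter names are illustrative.  rse_limit must be supplied by
    the caller because its exact value is not stated here.
    """
    if suppressed:
        # Rule 1: suppressed to protect respondent confidentiality.
        return False
    if employment < 10_000 and not majority_reported:
        # Rule 2: small industry, unless most employment was reported.
        return False
    if rse > rse_limit:
        # Rule 3: relative standard error above the limit.
        return False
    if benchmark_factor < 0.90 or benchmark_factor > 1.49:
        # Rule 4: benchmark factor outside the accepted range.
        return False
    return True
```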


Calculation of Incidence Rates:

   1.	The incidence rates represent the number of injuries and/or 
	illnesses, or lost workdays, per 100 full-time workers and were 
	calculated as:  (N/EH) x 200,000.  For illness rates, the
	multiplier is 20,000,000, giving a rate per 10,000 full-time
	workers.

	where:

	N = number of injuries and/or illnesses or lost workdays.

	EH = total hours worked by all employees during the calendar year.

	200,000 = base for 100 full-time equivalent workers 
	(working 40 hours/week, 50 weeks/year). 

	20,000,000 = base for 10,000 full-time equivalent workers
	(working 40 hours/week, 50 weeks/year).

   2.	Average lost workdays are calculated as:

	Total lost workdays/total lost workday cases.
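
The two calculations above can be written out as a short worked example.
The function and constant names are assumptions made for illustration;
only the formulas and the two bases come from the definitions above.

```python
RATE_BASE = 200_000         # 100 workers x 40 hours/week x 50 weeks/year
ILLNESS_BASE = 20_000_000   # 10,000 workers x 40 hours/week x 50 weeks/year


def incidence_rate(n, hours_worked, base=RATE_BASE):
    """(N / EH) x base, multiplying first to limit rounding error."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return n * base / hours_worked


def average_lost_workdays(total_lost_workdays, lost_workday_cases):
    """Total lost workdays divided by total lost workday cases."""
    if lost_workday_cases <= 0:
        raise ValueError("lost_workday_cases must be positive")
    return total_lost_workdays / lost_workday_cases
```

For example, 12 cases over 600,000 hours worked give
incidence_rate(12, 600_000) = 4.0 per 100 full-time workers, and 3 illness
cases over the same hours give incidence_rate(3, 600_000, ILLNESS_BASE) =
100.0 per 10,000 full-time workers.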

Frequency of Observations:  All data are annual.

Data Characteristics:  Rates are stored to one decimal place.  Numbers of cases
		       are in thousands.


References:	BLS Handbook of Methods, Chapter 9, "Occupational Safety 
		and Health Statistics", 1997 edition.


==================================================================================
Section 2
==================================================================================
The following Occupational Injuries and Illnesses Incidences Rate Data files are on the 
BLS internet in the sub-directory pub/time.series/si:

	si.case.type		- Case type codes		mapping file
	si.contacts		- Contacts for si survey 
	si.data.1.AllData	- All data
	si.data.type		- Data type codes		mapping file
	si.division		- Division codes		mapping file
	si.footnote		- Footnote codes		mapping file
	si.industry		- Industry codes		mapping file
	si.period		- Period codes			mapping file
	si.series		- All series and their beginning and end dates
	si.txt			- General information
=================================================================================
Section 3
=================================================================================
The definition of a time series and the relationships among the series, data,
and mapping files are detailed below:

A time series refers to a set of data observed over an extended period of time
at consistent time intervals (e.g., monthly, quarterly, semi-annually, annually).  
BLS time series data are typically produced at monthly intervals and represent data 
ranging from a specific consumer item in a specific geographical area whose price 
is gathered monthly to a category of worker in a specific industry whose employment
rate is being recorded monthly, etc.

The FTP files are organized such that data users are provided with the following
set of files to use in their efforts to interpret data files:

a)  a series file (only one series file per survey)
b)  mapping files
c)  data files

The series file contains a set of codes which, together, compose a series 
identification code that serves to uniquely identify a single time series.  
Additionally, the series file also contains the following series-level information:

a) the period and year corresponding to the first data observation 
b) the period and year corresponding to the most recent data observation. 

The mapping files are definition files that contain explanatory text descriptions
that correspond to each of the various codes contained within each series
identification code.

The data file contains one line of data for each observation period pertaining to a
specific time series.  Each line contains a reference to the following:

a) a series identification code
b) year in which data is observed
c) period for which data is observed (M13, Q05, and S03 indicate annual averages)
d) value
e) footnote code (if available)
=================================================================================
Section 4
=================================================================================
File Structure and Format: The following represents the file format used to define 
si.series.  Note that the Field Numbers are for reference only; they do not exist 
in the database.  Data files are in ASCII text format.  Data elements are separated 
by tabs; the first record of each file contains the column headers for the data
elements stored in each field.  Each record ends with a new line character. 

Field #/Data Element	Length		Value(Example)		

1.  series_id		  17		SIU00000001

2.  division_code	  2		00

3.  industry_code	  4		0000

4.  data_type_code	  1		1

5.  case_type_code	  1		T 

6.  begin_year		  4		1989

7.  begin_period	  3		A01

8.  end_year		  4		2000

9.  end_period		  3		A01


The series_id (SIU00000001) can be broken out into:

Code					Value

survey abbreviation	=		SI
seasonal (code) 	=		U
division_code		=		00
industry_code		=		0000
data_type_code		=		0
case_type_code		=		1
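
The breakout above can be reproduced by slicing the series_id at fixed
positions.  This is a sketch that assumes every series_id follows the
11-character layout of the example (2 + 1 + 2 + 4 + 1 + 1 characters) and
is padded with blanks to the 17-character field length; the class and
function names are hypothetical.

```python
from typing import NamedTuple


class SeriesId(NamedTuple):
    survey: str
    seasonal: str
    division_code: str
    industry_code: str
    data_type_code: str
    case_type_code: str


def parse_series_id(series_id):
    """Split a series_id such as 'SIU00000001' into its component codes."""
    sid = series_id.strip()  # drop padding from the 17-character field
    if len(sid) != 11:
        # Assumption: the layout matches the example in this section.
        raise ValueError(f"unexpected series_id length: {sid!r}")
    return SeriesId(
        survey=sid[0:2],
        seasonal=sid[2],
        division_code=sid[3:5],
        industry_code=sid[5:9],
        data_type_code=sid[9],
        case_type_code=sid[10],
    )
```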
==================================================================================
Section 5
==================================================================================
File Structure and Format: The following represents the file format used to define
each data file.  Note that the field numbers are for reference only; they do not 
exist in the database.  Data files are in ASCII text format.  Data elements are 
separated by tabs; the first record of each file contains the column headers for 
the data elements stored in each field.  Each record ends with a new line character. 

File Name:  si.data.1.AllData

The above-named data file has the following format:

Field #/Data Element	Length		Value(Example)		

1. series_id		  17		SIU00000001      	      	 	      	

2. year			   4		1989	

3. period		   3		A01		

4. value		  12      	0.4	
				 
5. footnote_codes	  10		It varies
				
The series_id (SIU00000001) can be broken out into:

Code					Value

survey abbreviation	=		SI
seasonal (code) 	=		U
division_code		=		00
industry_code		=		0000
data_type_code		=		0
case_type_code		=		1
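
A data file in this format can be read with a tab-delimited reader.  The
sketch below assumes that fields may be padded with blanks to their stated
lengths and that every value is numeric; the function name is hypothetical,
and rows with non-numeric values would need extra handling.

```python
import csv
import io


def read_data_rows(lines):
    """Yield one dict per observation from a tab-delimited data file.

    lines is any iterable of text lines whose first record holds the
    column headers.  Header names and values are stripped of padding.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        clean = {k.strip(): (v or "").strip() for k, v in row.items() if k}
        yield {
            "series_id": clean["series_id"],
            "year": int(clean["year"]),
            "period": clean["period"],
            "value": float(clean["value"]),
            "footnote_codes": clean.get("footnote_codes", ""),
        }


# Small in-memory sample built from the example values above.
sample = (
    "series_id        \tyear\tperiod\tvalue\tfootnote_codes\n"
    "SIU00000001      \t1989\tA01\t         0.4\t\n"
)
rows = list(read_data_rows(io.StringIO(sample)))
```

To read a downloaded file instead of the in-memory sample, pass an open
file object, for example open("si.data.1.AllData", newline="").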
================================================================================
Section 6
================================================================================
File Structure and Format:  The following represents the file format used to define
each mapping file. Note that the field numbers are for reference only; they do not
exist in the database.  Mapping files are in ASCII text format.  Data elements are
separated by tabs; the first record of each file contains the column headers for the
data elements stored in each field.  Each record ends with a new line character. 

File Name:  si.case.type

Field #/Data Element		Length		Value(Example)

1.  case_type_code		1		1
	
2.  case_type_text		55		Text


File Name:  si.data.type

Field #/Data Element		Length		Value(Example)

1.  data_type_code		1		3

2.  data_type_text		60		Text


File Name:  si.division

Field #/Data Element		Length		Value(Example)

1.  division_code		2		70

2.  division_name		50		Text


File Name:  si.footnote

Field #/Data Element		Length		Value(Example)

1. footnote_code		1		C

2. footnote_text		100		Text


File Name:  si.industry

Field #/Data Element		Length		Value(Example)

1.  division_code		2		09

2.  industry_code		4		0180

3.  industry_name		50		Text


File Name:  si.period

Field #/Data Element		Length		Value(Example)

1.  period			3		A01

2.  period_abbr			5		ANN

3.  period_name			20		Text
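
Each mapping file can be loaded into a lookup table keyed by its code
field.  This is a minimal sketch with a hypothetical function name; it
assumes a single key column, so si.industry, which is keyed by both
division_code and industry_code, would need a two-field key instead.

```python
import csv
import io


def load_mapping(lines, key_field, text_field):
    """Build a {code: text} dict from a tab-delimited mapping file."""
    reader = csv.DictReader(lines, delimiter="\t")
    mapping = {}
    for row in reader:
        clean = {k.strip(): (v or "").strip() for k, v in row.items() if k}
        mapping[clean[key_field]] = clean[text_field]
    return mapping


# In-memory sample using the example values for si.period.
sample = "period\tperiod_abbr\tperiod_name\nA01\tANN\tAnnual\n"
periods = load_mapping(io.StringIO(sample), "period", "period_name")
```

The resulting dict can then be used to translate the codes found in the
series and data files into their text descriptions.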
=========================================================================================
Section 7
=========================================================================================
OCCUPATIONAL INJURIES AND ILLNESSES INCIDENCES RATE DATA (SI) 
DATABASE ELEMENTS


Data Element	Length		Value(Example)			Description

begin_period	3		A01=Annual		Identifies first observation
							of data series by frequency 
							and period.

begin_year	4		YYYY			Identifies earliest year for
				Ex: 1985		which data series is available.
						
case_type_code	1		Ex: T=Total recordable  Code identifying type of cases 
				cases of poisoning	to which the incidence rate	
							applies. 
							
case_type_text	55		Text			Name identifying the type of
				Ex: Injury,illness	cases to which the incidence
							rate refers.
				
data_type_code	1		Ex: 1=Rate of injury 	Code identifying the data type 
				cases per 100 		to which the incidence rate 
				full-time workers	refers.

data_type_text	60		Text 			Name identifying the data 
				Ex: Lost workdays	type to which the incidence
				 			rate refers.
				    	

division_code	2		Ex: 10=Mining		Code identifying the major 
							industry division.

division_name	50		Text			Name of the major industry 
				Ex: Services		division.
				
end_period	3		A01=Annual		Identifies last observation 
							of data series by frequency 
							and period.

end_year	4		YYYY			Identifies latest year for
				Ex: 1990		which data are available.

footnote_code	1		C			Identifies footnote for the data 
							series.

footnote_codes	10		It varies		Identifies footnotes for the data 
							series.	
							
footnote_text	100		Text			Contains the text of the footnote.

industry_code	4		Ex: 0000=Private	SIC code identifying industry.	
				industry
				
industry_name	50		Text			Name of industry to which data 
				Ex: Mining		pertain.
				
period_abbr	5		Period name		Abbreviation of period name.
				abbreviation
				Ex: ANN

period		3		A01=Annual		Identifies period for which 
							data is observed.
							
period_name	20		Text			Full name of period to which the  
				Ex: Annual		data observation refers.

rounded		1		N=Not rounded to 0	Code indicating if "0" value is 
				Y=Rounded to 0		the result of rounding.
					
series_id	17		Code series identifier 	Code identifying the specific series.
				Ex: SIU00000001
				
value		12		Data value		Incidence rate or number of cases.
	   			Ex: 0.4

year		4		YYYY			Identifies year of observation. 
				Ex: 1990