Occupational Asthma Reference

OBrien C, Bright P, Nicholson C, Burge PS, Patterns of peak expiratory flow response to upper respiratory tract infections in asthmatics, Eur Respir J, 1995;8 Suppl 19:272s,

Keywords: Oasys, respiratory infection, program

Known Authors

Sherwood Burge

Phil Bright


Abstract

Introduction
Virus infections are a common confounding factor when analysing serial peak flow records for occupational asthma. Here we try to define a method for automatically identifying upper respiratory tract infection patterns in serial peak flow records, without identifying similar patterns that are due to an occupational effect. The techniques are based on "Patterns of Peak Expiratory Flow Response to Upper Respiratory Tract Infection in Adults with Asthma" (1).

In (1) the first day of symptoms is known. The start of the pattern is the first daily mean below the 95% confidence interval of the baseline. The end of the infection pattern is defined by the first of three consecutive daily means within or above the 95% confidence interval of the baseline.




In (1) the baseline is made from the first 7 days of the record. If there is an infection in the first 7 days then the last 7 days are used. The baseline is the mean and standard deviation of the daily means.
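The baseline described above can be sketched in a few lines. This is only an illustration, assuming that the "95% confidence interval of the baseline" means the baseline mean plus or minus 1.96 standard deviations of the daily means; the function name `baseline_limits` and its parameters are hypothetical and are not taken from (1) or from Oasys.

```python
from statistics import mean, stdev

def baseline_limits(daily_means, days=7, use_last=False, z=1.96):
    """Return (lower, upper) baseline limits from a 7-day window.

    Assumption: the limits are mean +/- z * sample standard deviation
    of the daily means in the window.
    """
    # Use the last `days` values when the first week contains an infection.
    window = daily_means[-days:] if use_last else daily_means[:days]
    centre = mean(window)
    spread = stdev(window)
    return centre - z * spread, centre + z * spread

# Hypothetical daily mean PEFs (L/min); day 8 is a clear drop.
record = [400, 410, 390, 405, 395, 400, 400, 300]
lower, upper = baseline_limits(record)
```

A daily mean is then treated as below baseline when it is less than `lower`.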
Duration and Decline
It is difficult to use time to nadir, time from nadir, total duration or maximal decline to differentiate between infection and work-related patterns. Infection patterns show very large variation in these measures. There are trends, but any sensible limits set on these measures would also include virtually all work-related patterns.



Methods
In (1) four types of infection pattern are identified, but by far the most common is Pattern A (Decline with Recovery). We will use half of these (41 records) for setup and the other half for evaluation. To look at the effect on records with occupational asthma, the positive gold standards from (2) are used as controls: roughly half for setup and the others for evaluation. These are filtered so that all records are of adequate quality (3), are free from confounding factors, and so that no two records in a group come from the same patient. There are 29 control records in the setup group and 32 in the evaluation group.

In (1) only patterns with a significant deterioration are considered. A significant deterioration is defined as two or more consecutive daily means below the 95% confidence interval for the baseline. The end of a pattern is defined as the first of three consecutive daily means within baseline. In (1) the baseline is taken from the first 7 days of the record, unless it contains an infection, in which case the last 7 days are used.

Infection records have 2 readings a day every day. The occupational asthma control records have between 0 and 11 readings per day.

The following methods of identifying infections are tried:
Baseline
Infections are defined as a certain number of consecutive days below the 95% confidence interval for the baseline. The start and end of the infection are defined as a certain number of consecutive days within or above the 95% confidence interval for the baseline. Baselines are calculated in a variety of ways to find the most effective.
What do we do for days with no readings?
Line Generalisation
A line graph is drawn of the mean PEF for each day. Cartographic techniques for line generalisation are then used to simplify the line. Any resulting dips in the line graph are said to be infections. The “Perpendicular Distance Algorithm” and the “Angular Tolerance Algorithm” are tried. A variety of angular and perpendicular tolerances are tried to find the most effective.
Peak / Trough Detection
(4) provides a mathematical definition of peaks and troughs in a physiological signal. Various tolerances are used to identify troughs that are due to infection.

Baseline Method
Both 2 and 3 consecutive days within the 95% confidence interval are tried to define the start and end of an infection. 2 consecutive days below the 95% confidence interval are used to define the infection.
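One way to read the baseline method is sketched below. It is a minimal illustration under stated assumptions: every day has a daily mean (the question of days with no readings is left open), a pattern starts at the first day below the lower limit, and it ends at the first of `n_within` consecutive days at or above that limit. The name `find_patterns` and its parameters are hypothetical.

```python
def find_patterns(daily_means, lower, n_below=2, n_within=3):
    """Return (start, end) day indices of candidate infection patterns.

    start: first day below the lower baseline limit.
    end:   first of n_within consecutive days back at or above the limit.
    A pattern is kept only if it contains n_below consecutive days below.
    """
    patterns = []
    i, n = 0, len(daily_means)
    while i < n:
        if daily_means[i] >= lower:
            i += 1
            continue
        start = i
        run_below = longest_below = run_within = 0
        end = None
        for j in range(start, n):
            if daily_means[j] < lower:
                run_below += 1
                longest_below = max(longest_below, run_below)
                run_within = 0
            else:
                run_below = 0
                run_within += 1
                if run_within == n_within:
                    end = j - n_within + 1
                    break
        if end is None:
            break  # the record ends before the pattern recovers
        if longest_below >= n_below:
            patterns.append((start, end))
        i = end
    return patterns

# Hypothetical record with a double dip and a lower limit of 390 L/min.
example = find_patterns([400, 400, 380, 370, 400, 375, 400, 400, 400, 400], 390)
```

In this sketch a double dip is reported as a single pattern because one day back within baseline is not enough to end it when `n_within` is 3.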

Common sense is used to decide if a calculated pattern matches a validated infection pattern. The identification of a pattern is considered more important than the exact duration of it. Generally if a calculated pattern and a validated pattern overlap they are considered a match. In some cases two patterns may be identified in the place of one validated infection pattern (where the pattern has a double dip). The first of two such patterns is said to identify the infection and the second is ignored (not regarded as a false positive).

Various methods to remove false positives are tried: excluding patterns with a maximal decline significantly less than the highest maximal decline for a record; excluding patterns with a maximal duration significantly less than the highest maximal duration for a record; excluding many patterns that correspond to the number of work periods; and the workday baseline method (see below).

Percentage of the Highest Daily means
This technique calculates a 95% confidence interval on a certain percentage of the highest daily means. If the record contains an infection the affected days will probably not be included in the baseline. The baseline will be higher than that for the whole record. Percentages from 30% to 100% are tried.
Percentage of the Median Daily means
This technique calculates a 95% confidence interval on a certain percentage of the median daily means. If the record contains an infection the affected days will probably not be included in the baseline. The baseline mean will probably be similar to the whole record mean. The baseline confidence interval will be lower than that for the whole record. Percentages from 30% to 100% are tried.
Workday
This technique uses a certain percentage range of the lowest daily mean PEFs for work days only to calculate the baseline. In occupational asthmatics this will probably be lower than the whole record baseline, which may prevent patterns due to occupational asthma being identified as infections. In non-occupational subjects this will be slightly lower than the whole record baseline. The ranges 35%-95%, 45%-95%, 55%-95%, 40%-100%, 50%-100% and 60%-100% are tried.
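The percentage ranges can be illustrated as follows. This sketch assumes that a range such as 35%-95% means: sort the work-day daily means from lowest to highest and keep the values between the 35% and 95% positions in that ordering. That reading, and the name `percent_range`, are assumptions made for illustration only.

```python
from statistics import mean, stdev

def percent_range(values, lo_pct, hi_pct):
    """Keep the values between two percentage positions of the sorted list."""
    ranked = sorted(values)
    n = len(ranked)
    first = n * lo_pct // 100
    last = n * hi_pct // 100
    return ranked[first:last]

def workday_baseline(workday_means, lo_pct, hi_pct, z=1.96):
    """Return (lower, upper) limits from the selected work-day daily means.

    Assumption: limits are mean +/- z * sample standard deviation, so at
    least two values must fall inside the selected range.
    """
    subset = percent_range(workday_means, lo_pct, hi_pct)
    centre, spread = mean(subset), stdev(subset)
    return centre - z * spread, centre + z * spread

# Hypothetical work-day daily means: 300, 310, ..., 490 L/min.
workdays = list(range(300, 500, 10))
subset = percent_range(workdays, 35, 95)
```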
Line Generalisation

Two local processing routines are used to simplify a line graph of the daily mean PEF: the “Perpendicular Distance Algorithm” and the “Angular Tolerance Algorithm”. Local processing routines use a point’s neighbours to decide whether to retain it. These are some of the simplest line generalisation techniques; more complex algorithms could be studied in a follow-up paper.
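As an illustration of a local processing routine, the sketch below shows one common variant of perpendicular distance simplification: an interior point is kept only if its perpendicular distance from the line joining the last kept point and the next point exceeds the tolerance. The exact variant used in this work is not stated, so the details here (including the names `perpendicular_distance` and `simplify`) are assumptions.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length

def simplify(points, tolerance):
    """Drop interior points that lie close to the line through their neighbours."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        if perpendicular_distance(points[i], kept[-1], points[i + 1]) > tolerance:
            kept.append(points[i])
    kept.append(points[-1])
    return kept

# Hypothetical (day, daily mean PEF) points with a single dip on day 3.
line = [(0, 400), (1, 401), (2, 400), (3, 350), (4, 400), (5, 400)]
simplified = simplify(line, 5)
```

Because days and PEF values are on very different scales, the result depends strongly on how the axes are scaled before distances are measured.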



[Figures: Perpendicular Distance Algorithm; Angular Tolerance Algorithm]
Peak / Trough Detection
A point i in the line is said to dominate a point j if and only if the value at i is greater than the value at j plus d, where d is the tolerance, and the line between i and j is bounded by the value at i above and the value at j below. A peak is a point in the line that dominates a preceding and a subsequent point. A trough is a point in the line that is dominated by a preceding and a subsequent point.
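The definitions above can be checked directly with a brute-force sketch. This is a literal reading of the dominance definition written for clarity, not the algorithm given in (4), and the function names are hypothetical.

```python
def dominates(signal, i, j, d):
    """True if point i dominates point j with tolerance d."""
    high, low = signal[i], signal[j]
    if not high > low + d:
        return False
    a, b = sorted((i, j))
    # Every point strictly between i and j must lie between the two values.
    return all(low <= signal[k] <= high for k in range(a + 1, b))

def troughs(signal, d):
    """Indices dominated by both a preceding and a subsequent point."""
    n = len(signal)
    return [
        j for j in range(n)
        if any(dominates(signal, i, j, d) for i in range(j))
        and any(dominates(signal, i, j, d) for i in range(j + 1, n))
    ]

def peaks(signal, d):
    """Indices that dominate both a preceding and a subsequent point."""
    n = len(signal)
    return [
        i for i in range(n)
        if any(dominates(signal, i, j, d) for j in range(i))
        and any(dominates(signal, i, j, d) for j in range(i + 1, n))
    ]

# Hypothetical daily mean PEFs with one dip, using a tolerance of 30 L/min.
signal = [400, 395, 350, 360, 398, 402]
found = troughs(signal, 30)
```

Here only the lowest point of the dip is reported; the partially recovered day after it is not, because the lower point lies between it and any earlier candidate.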

In order to avoid transitory one-day patterns a line graph is created whereby each point is the rolling mean for n days. For example the first point is the mean of days 1 to n, the second point is the mean of days 2 to n + 1, and so on. Values of n from 1 to 4 are tried. If any of the days that are used to make up the rolling mean have no readings then no point is created.
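A minimal sketch of the rolling mean follows. It assumes days with no readings are represented by `None`, and it records `None` where no point is created so that positions still line up with days; both choices are assumptions made for illustration.

```python
def rolling_means(daily_means, n):
    """Rolling n-day means; None where any day in the window has no readings."""
    points = []
    for start in range(len(daily_means) - n + 1):
        window = daily_means[start:start + n]
        if any(value is None for value in window):
            points.append(None)  # no point is created for this window
        else:
            points.append(sum(window) / n)
    return points

# Hypothetical record with no readings on the third day.
smoothed = rolling_means([400, 380, None, 390, 410], 2)
```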

The tolerance is set to 1.96 and 2.576 standard deviations of the baseline so that only exceptional peaks / troughs are detected.

The ((mean of all PEFs on workdays) – (mean of all PEFs on restdays)) is added to the tolerance, if this value is positive. This will probably increase the tolerance of occupational asthmatics, which will prevent identification of patterns due to occupational asthma. This will have little or no effect on non-occupational records.
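The tolerance calculation can be sketched as below. The sketch assumes the baseline standard deviation is the sample standard deviation of the daily means, and it takes the work/rest difference in the order written above (workdays minus rest days); whether that order matches the implementation is not confirmed here. The name `trough_tolerance` is hypothetical.

```python
from statistics import mean, stdev

def trough_tolerance(daily_means, work_pefs, rest_pefs, z=1.96):
    """Tolerance d = z * SD of daily means, plus the work/rest difference if positive."""
    tolerance = z * stdev(daily_means)
    # Difference taken in the order stated in the text: workdays minus rest days.
    difference = mean(work_pefs) - mean(rest_pefs)
    if difference > 0:
        tolerance += difference
    return tolerance

# Hypothetical values in L/min.
daily = [400, 410, 390, 405, 395]
with_difference = trough_tolerance(daily, [420, 430], [400, 410])
without_difference = trough_tolerance(daily, [380], [400])
```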

Examples of peak / trough detection on neonatal chest movement. Vertical lines show which peaks and troughs are detected.
Maybe redo these pictures on serial peak flow data?

A high tolerance detects only the large peaks. A smaller tolerance detects the smaller peaks as well.

This technique detects the trough of an infection pattern; it does not attempt to define the start and end of the pattern. An infection is considered identified if the method finds a trough within the duration of the pattern as defined in (1). An infection false positive is defined as a trough that is not within a pattern defined in (1). A control false positive is any trough found in a control record. Peaks are ignored.
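The scoring rules above can be expressed as a short sketch. It assumes validated patterns are given as inclusive (start day, end day) pairs and that a control record is simply a record with no validated patterns; the name `score_record` is hypothetical.

```python
def score_record(trough_days, validated_patterns):
    """Return (infections identified, false positives) for one record.

    validated_patterns: list of (start_day, end_day) pairs, inclusive.
    For a control record pass an empty list, so every trough is a false positive.
    """
    identified = set()
    false_positives = 0
    for day in trough_days:
        hits = [k for k, (start, end) in enumerate(validated_patterns)
                if start <= day <= end]
        if hits:
            identified.update(hits)
        else:
            false_positives += 1
    return len(identified), false_positives

# Hypothetical record: one validated pattern on days 3 to 9, troughs on days 5 and 20.
result = score_record([5, 20], [(3, 9)])
```

Two troughs inside the same validated pattern count as one identified infection in this sketch.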

Solar data is used to find the infections.

- do we want to average all readings or average the mean for each day
- have done mean of daily means
- v1071 is a good example of this - it kind of gets the nadir in the wrong place

We are using solar data to calculate the infections so the wasif score will not be so good. How much do we worry about this? Need to discuss it in paper.
– potential problem when there are exclusions as a lot of the data could then be missing.
– Readings also created and removed and stuff which would be a little strange.
To show the start and end of the infection on the graph it is better to work out the start and end using the same interpretation as the graph is using.
Results
Baseline Method
Defining an infection as 2 days below the 95% confidence interval was much better at picking out infections. The Workday baseline was the best compromise between identifying infections and preventing false positives. Excluding patterns on maximal decline was the best method of removing further false positives and can be combined with the workday baseline method. 60% was the best cut-off. Maximal duration and many patterns were relatively poor at removing false positives.

This table shows the results of various baselines on the setup set. Infections are identified by 2 days below the 95% confidence interval. The pattern with the largest decline in a record is identified as an infection. Any other patterns with a maximal decline less than 60% of the largest maximal decline are excluded.

Baseline               Infections (of 42)   Infection False Positives   Control False Positives (29 records)
Whole record mean      41                   10                          41
All Work Days          42                   10                          25
35-95% of Work Days    41                   9                           18
45-95% of Work Days    38                   9                           14
55-95% of Work Days    36                   8                           11
40-100% of Work Days   38                   7                           13
50-100% of Work Days   36                   9                           11
60-100% of Work Days   35                   7                           8
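The maximal-decline exclusion used for these results can be sketched as follows. The sketch assumes each pattern already carries its maximal decline as a number (how that decline is measured is not specified here), and the name `filter_by_decline` is hypothetical.

```python
def filter_by_decline(patterns, cutoff=0.60):
    """Keep patterns whose maximal decline is at least cutoff * the largest decline.

    patterns: list of (start_day, end_day, maximal_decline) tuples for one record.
    The pattern with the largest decline is always kept.
    """
    if not patterns:
        return []
    largest = max(decline for _, _, decline in patterns)
    return [p for p in patterns if p[2] >= cutoff * largest]

# Hypothetical record with three candidate patterns (declines in L/min).
kept = filter_by_decline([(2, 6, 50), (10, 13, 20), (20, 25, 35)])
```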
Line Generalisation Method
This method did not perform well at any point and was dropped very early on.
Peak / Trough Detection
The table below shows the results of peak / trough detection on the setup set for various combinations of tolerance and number of days used to produce the rolling mean. In all cases the ((mean of all PEFs on workdays) – (mean of all PEFs on restdays)) is added to the tolerance, if this value is positive. The tolerance is the number of standard deviations stated of the baseline. The baseline is calculated from all of the daily mean PEFs. No patterns have been excluded on maximal decline as it seemed unnecessary.

Standard deviations   Days   Infections (of 42)   Infection False Positives   Control False Positives (29 records)
1.96                  1      42                   35                          16
2.576                 1      36                   15                          6
1.96                  2      37                   15                          4
2.576                 2      31                   4                           2
1.96                  3      33                   6                           3
1.96                  4      29                   1                           2

Evaluation
Peak / Trough detection seems to be the best method. It identifies fewer infections than the baseline method, but for every infection missed two false positives are also eliminated.

The table below shows the results of peak / trough detection on the evaluation set for a tolerance of 1.96 standard deviations of the baseline of all the daily means + ((mean of all PEFs on workdays) – (mean of all PEFs on restdays)) (if this value is positive). A rolling 4 day mean is used to calculate the points in the signal that is used to detect peaks and troughs.
Standard deviations   Days   Identified (of 45)   Infection False Positives   Control False Positives (29 records)
1.96                  4      32                   2                           1

Sensitivity 71% (32 of 45 infections identified)
Specificity 97% (based on control records)

Discussion
The method will not work if the record has only one work period and the infection falls within it, because the algorithm will treat it as an occupational asthma pattern.

The start and end of a pattern must both be in the record for it to be detected (so learning effects and laze effects will not be found).

Infection records have two readings a day every day. What do control records have – find out.

References

1. Patterns of Peak Expiratory Flow Response to Upper Respiratory Tract Infection in Adults with Asthma
2. The development of Oasys-2; a system for the analysis of serial measurement of peak expiratory flow in workers with suspected occupational asthma
3. Wasif’s Quality paper (publish it soon man!)
4. An algorithm for the detection of peaks and troughs in physiological signals, Bryan S. Todd, Oxford University Computing Laboratory.

