Replication Data for: Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information (doi:10.7910/DVN/QR4N7V)

Document Description

Citation

Title:

Replication Data for: Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information

Identification Number:

doi:10.7910/DVN/QR4N7V

Distributor:

Harvard Dataverse

Date of Distribution:

2022-12-08

Version:

1

Bibliographic Citation:

Considine, Ellen, 2022, "Replication Data for: Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information", https://doi.org/10.7910/DVN/QR4N7V, Harvard Dataverse, V1, UNF:6:J+iWb1dGAgHgWN62soaKLA== [fileUNF]

Study Description

Citation

Title:

Replication Data for: Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information

Identification Number:

doi:10.7910/DVN/QR4N7V

Authoring Entity:

Considine, Ellen (Harvard University)

Other identifications and acknowledgements:

Nethery, Rachel

Other identifications and acknowledgements:

deSouza, Priyanka

Other identifications and acknowledgements:

Braun, Danielle

Other identifications and acknowledgements:

Kamareddine, Leila

Grant Number:

5T32ES007142

Grant Number:

1K01ES032458

Distributor:

Harvard Dataverse

Access Authority:

Considine, Ellen

Depositor:

Considine, Ellen

Date of Deposit:

2022-12-08

Holdings Information:

https://doi.org/10.7910/DVN/QR4N7V

Study Scope

Keywords:

Computer and Information Science, Earth and Environmental Sciences, Mathematical Sciences, Social Sciences, air quality, low-cost sensors, environmental justice, information access, decision making, simulations

Abstract:

This analytic dataset contains environmental and socio-demographic characteristics of the state of California at daily temporal resolution and 1km x 1km spatial resolution. All original data sources are open access; we share this processed dataset to facilitate replication of our paper and further exploration.

Notes:

Code for this analysis (including notes on data procurement and data processing as well as generation of figures and tables for the paper) can be found here: https://github.com/EllenConsidine/LCS_placement_sims

Methodology and Processing

Sources Statement

Data Sources:

See references in paper

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

Related Publications

Citation

Title:

Considine EM, Braun D, Kamareddine L, Nethery RC, and deSouza P. Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information. Environmental Science & Technology.

Bibliographic Citation:

Considine EM, Braun D, Kamareddine L, Nethery RC, and deSouza P. Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information. Environmental Science & Technology.

File Description--f6793063

File: Daily_AQS_PA_2020.tab

  • Number of cases: 1409

  • No. of variables per record: 9

  • Type of File: text/tab-separated-values

Notes:

UNF:6:QFu+R1vf8hCLNaL6DyZwww==

Dataframe of 24-hour average measurements from PurpleAir low-cost sensors (LCS) located within 50 meters of EPA Air Quality System (AQS) monitors
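As a rough illustration of how this file might be read, the sketch below parses tab-separated rows with the column names listed under Variable Description and computes the mean difference between the two PM2.5 columns. It assumes that PM2.5 holds the AQS reading and PA_PM25 the PurpleAir reading, which this codebook does not state explicitly. The sample rows, the date format, and the helper names read_collocated and mean_difference are hypothetical and are not taken from the dataset or the authors' code.

```python
import csv
import io

# Synthetic rows for illustration only; these are NOT values from the dataset.
# Column names follow the variable list in this codebook; the date format is a guess.
SAMPLE_TSV = (
    "Lon\tLat\tDate\tPM2.5\tPA.ID\tDist\tPA_PM25\tTemp\tRH\n"
    "-120.0\t36.0\t2020-01-01\t8.0\t1000\t0.01\t12.0\t60.0\t40.0\n"
    "-120.0\t36.0\t2020-01-02\t10.0\t1000\t0.01\t13.0\t62.0\t45.0\n"
)

NUMERIC_COLUMNS = ["Lon", "Lat", "PM2.5", "Dist", "PA_PM25", "Temp", "RH"]


def read_collocated(handle):
    """Parse tab-separated rows into dicts, converting numeric fields to float."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        for key in NUMERIC_COLUMNS:
            rec[key] = float(rec[key])
        rows.append(rec)
    return rows


def mean_difference(rows):
    """Mean of (PA_PM25 minus PM2.5) over all rows.

    Assumption: PA_PM25 is the PurpleAir value and PM2.5 is the AQS value.
    """
    diffs = [r["PA_PM25"] - r["PM2.5"] for r in rows]
    return sum(diffs) / len(diffs)


rows = read_collocated(io.StringIO(SAMPLE_TSV))
print(len(rows), mean_difference(rows))
```

To try this on the real file, one would pass an open file handle instead of the in-memory sample, for example `open("Daily_AQS_PA_2020.tab", newline="")`, after checking how missing values and quoting appear in the download.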

File Description--f6793062

File: PA_outside.tab

  • Number of cases: 12851

  • No. of variables per record: 44

  • Type of File: text/tab-separated-values

Notes:

UNF:6:xR+XaZ3qdLwX5mmz1q6AKw==

Locations and IDs of outdoor PurpleAir LCS
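The summary statistics for lat and lon in this file span much more than California, and some coordinates are missing (12780 valid of 12851 cases). A minimal sketch of one way to subset the sensors is shown below. The bounding box is an approximate rectangle around California chosen for illustration (it also covers parts of neighboring states), the handling of blank or "NA" fields is an assumption about how missing values are encoded, and the sample rows and function names are hypothetical.

```python
import csv
import io

# Approximate rectangle around California (an assumption for illustration;
# it is not a state boundary and includes parts of neighboring states).
CA_BOUNDS = {"lat_min": 32.5, "lat_max": 42.1, "lon_min": -124.5, "lon_max": -114.1}

# Synthetic rows for illustration only; NOT values from the dataset.
SAMPLE_TSV = (
    "id\tlat\tlon\n"
    "1\t37.0\t-122.0\n"
    "2\tNA\t-120.0\n"
    "3\t-44.0\t170.0\n"
)


def parse_float(text):
    """Return a float, or None when the field is absent, blank, or text such as 'NA'."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def sensors_in_box(handle, bounds=CA_BOUNDS):
    """Return (id, lat, lon) tuples for rows whose coordinates fall inside the box."""
    kept = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        lat = parse_float(rec.get("lat"))
        lon = parse_float(rec.get("lon"))
        if lat is None or lon is None:
            continue  # skip rows with missing coordinates
        inside_lat = bounds["lat_min"] <= lat <= bounds["lat_max"]
        inside_lon = bounds["lon_min"] <= lon <= bounds["lon_max"]
        if inside_lat and inside_lon:
            kept.append((rec["id"], lat, lon))
    return kept


kept = sensors_in_box(io.StringIO(SAMPLE_TSV))
print(kept)
```

A rectangle is only a first pass; a point-in-polygon test against a state boundary would be needed for an exact subset.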

Variable Description

List of Variables:

Variables

Lon

f6793063 Location:

Summary Statistics: Valid 1409.0; Min. -124.17949; StDev 1.5568078358320925; Mean -119.6888523484741; Max. -116.86147

Variable Format: numeric

Notes: UNF:6:FFsUCOfVfLa4+KblFu/9hw==

Lat

f6793063 Location:

Summary Statistics: Min. 33.859662; Max. 40.77678; Mean 36.198941693399576; StDev 2.006918766155282; Valid 1409.0

Variable Format: numeric

Notes: UNF:6:P2i9dpGUZmWdp8j99YpEXA==

Date

f6793063 Location:

Variable Format: character

Notes: UNF:6:VqEwjmGk16cGUtyv2+hhHg==

PM2.5

f6793063 Location:

Summary Statistics: Min. 0.0; Max. 202.2041665; Valid 1409.0; StDev 11.153444934808967; Mean 7.929568774473623

Variable Format: numeric

Notes: UNF:6:Bo9BN/cOt7o1yXtzv/rVOA==

PA.ID

f6793063 Location:

Summary Statistics: Valid 1409.0; Min. 1854.0; Mean 13542.491838183136; StDev 11469.555146315502; Max. 55707.0

Variable Format: numeric

Notes: UNF:6:q5nb8Qd4PtpTJptLttFckg==

Dist

f6793063 Location:

Summary Statistics: StDev 0.09000277046670038; Valid 1409.0; Max. 0.429595795479375; Min. 0.0; Mean 0.04774514229637395

Variable Format: numeric

Notes: UNF:6:FL9HcHkaBqYVLrZc8O9X8w==

PA_PM25

f6793063 Location:

Summary Statistics: Min. 0.0; Max. 5084.62981595092; Valid 1409.0; StDev 185.73254210975716; Mean 21.26625649925973

Variable Format: numeric

Notes: UNF:6:CAsgfUUAkZOuZh3WoZNKBw==

Temp

f6793063 Location:

Summary Statistics: Max. 104.0; StDev 12.068176781740556; Mean 67.93235405478374; Valid 1409.0; Min. 31.5222222222222

Variable Format: numeric

Notes: UNF:6:8G3mEmjyQJuZVguxFmo3Jg==

RH

f6793063 Location:

Summary Statistics: Mean 44.78844859425839; Valid 1409.0; Max. 100.0; Min. 7.64861111111111; StDev 15.349379024397724

Variable Format: numeric

Notes: UNF:6:Z27mNlvFxYw0S81pYZ+OYw==

id

f6793062 Location:

Summary Statistics: Max. 111191.0; StDev 30352.837349578214; Mean 53962.334293051186; Min. 179.0; Valid 12851.0

Variable Format: numeric

Notes: UNF:6:SBiIVi7CX1ZWw3tJit+MRA==

10min_avg

f6793062 Location:

Summary Statistics: Mean 12.589349890999687; Valid 12844.0; Max. 1560.52; Min. 0.0; StDev 49.11465917149697

Variable Format: numeric

Notes: UNF:6:yRxgm5wJEHIFU+bmXcENyg==

1day_avg

f6793062 Location:

Summary Statistics: StDev 14.297630131751934; Mean 9.03677359078169; Valid 12844.0; Max. 399.24; Min. 0.0

Variable Format: numeric

Notes: UNF:6:7a3V7Ue4WMCa8ZlGb8gB9A==

1hour_avg

f6793062 Location:

Summary Statistics: Max. 1094.23; Valid 12844.0; Mean 12.017312363749612; StDev 31.825227029552664; Min. 0.0

Variable Format: numeric

Notes: UNF:6:p3nggmmu8cUkOn+i6ykr/Q==

1week_avg

f6793062 Location:

Summary Statistics: Max. 326.9; Min. 0.0; Mean 7.4833704453441285; Valid 12844.0; StDev 9.624684340607995

Variable Format: numeric

Notes: UNF:6:NpG6nKQmZ24e42xddt/bfQ==

30min_avg

f6793062 Location:

Summary Statistics: Max. 1212.4; StDev 37.76620244792041; Valid 12844.0; Mean 12.351128153223295; Min. 0.0

Variable Format: numeric

Notes: UNF:6:hEqEZ2nFE4Cw+ngKVUeEtQ==

6hour_avg

f6793062 Location:

Summary Statistics: Mean 10.659417626907501; Max. 595.0; StDev 20.32493902464433; Min. 0.0; Valid 12844.0

Variable Format: numeric

Notes: UNF:6:4NJ1y6TxMixehKXJ0VZNDw==

adc

f6793062 Location:

Summary Statistics: Mean NaN; StDev NaN; Max. NaN; Valid 0.0; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

age

f6793062 Location:

Summary Statistics: Max. 1426739.0; Mean 2701.9088786865423; Valid 12851.0; StDev 25332.31298399713; Min. 0.0

Variable Format: numeric

Notes: UNF:6:zq+rxsz+nvCT7kIFcYYafg==

brightness

f6793062 Location:

Summary Statistics: StDev NaN; Max. NaN; Mean NaN; Min. NaN; Valid 0.0

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

created

f6793062 Location:

Summary Statistics: StDev NaN; Max. NaN; Valid 0.0; Min. NaN; Mean NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

downgraded

f6793062 Location:

Variable Format: character

Notes: UNF:6:Y+2UcnFWhXhv21kxjaTxEQ==

flagged

f6793062 Location:

Variable Format: character

Notes: UNF:6:bGAeL102ggJhm3Ei4VT0Nw==

hardware

f6793062 Location:

Summary Statistics: Valid 0.0; Mean NaN; Max. NaN; StDev NaN; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

hidden

f6793062 Location:

Variable Format: character

Notes: UNF:6:mtBTf+4kgbpArSDCRemo9A==

humidity

f6793062 Location:

Summary Statistics: StDev 17.06977999626973; Mean 44.44301559234852; Min. 0.0; Valid 12442.0; Max. 100.0

Variable Format: numeric

Notes: UNF:6:C2PKen8ZzI7mnhHZnppgqw==

is_owner

f6793062 Location:

Variable Format: character

Notes: UNF:6:mtBTf+4kgbpArSDCRemo9A==

last_seen

f6793062 Location:

Variable Format: character

Notes: UNF:6:7scXcRpOs64jGQd/T9Y0Xw==

last_update_check

f6793062 Location:

Summary Statistics: Mean NaN; Max. NaN; Min. NaN; Valid 0.0; StDev NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

lat

f6793062 Location:

Summary Statistics: Max. 71.301356; Mean 36.836075891627544; Min. -44.852238; Valid 12780.0; StDev 12.531345998922369

Variable Format: numeric

Notes: UNF:6:3T1TTsTINr2UNVQPDG9IRw==

location_type

f6793062 Location:

Variable Format: character

Notes: UNF:6:sK4wxECP+kInORLC3uZ3/Q==

lon

f6793062 Location:

Summary Statistics: Mean -96.1649529054773; Min. -176.5285; Max. 178.030934; Valid 12780.0; StDev 56.60794141299921

Variable Format: numeric

Notes: UNF:6:slaBG4+mAekGF09aMv4pgQ==

model

f6793062 Location:

Variable Format: character

Notes: UNF:6:N/CMkPK6MPGfax3YhncAFA==

name

f6793062 Location:

Variable Format: character

Notes: UNF:6:KVjIA3yO8T8BVr+yuCK6mw==

p_0_3_um

f6793062 Location:

Summary Statistics: Max. NaN; Valid 0.0; Mean NaN; StDev NaN; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

p_0_5_um

f6793062 Location:

Summary Statistics: Min. NaN; Max. NaN; Valid 0.0; Mean NaN; StDev NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

p_10_0_um

f6793062 Location:

Summary Statistics: Valid 0.0; Min. NaN; StDev NaN; Mean NaN; Max. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

p_1_0_um

f6793062 Location:

Summary Statistics: Mean NaN; Valid 0.0; Max. NaN; Min. NaN; StDev NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

p_2_5_um

f6793062 Location:

Summary Statistics: Mean NaN; Min. NaN; Max. NaN; StDev NaN; Valid 0.0

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

p_5_0_um

f6793062 Location:

Summary Statistics: Mean NaN; Max. NaN; Min. NaN; StDev NaN; Valid 0.0

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

parent

f6793062 Location:

Summary Statistics: Valid 0.0; Min. NaN; StDev NaN; Mean NaN; Max. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm10_0_atm

f6793062 Location:

Summary Statistics: StDev NaN; Min. NaN; Valid 0.0; Mean NaN; Max. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm10_0_cf_1

f6793062 Location:

Summary Statistics: StDev NaN; Min. NaN; Valid 0.0; Mean NaN; Max. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm1_0_atm

f6793062 Location:

Summary Statistics: Mean NaN; Max. NaN; Min. NaN; Valid 0.0; StDev NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm1_0_cf_1

f6793062 Location:

Summary Statistics: StDev NaN; Mean NaN; Valid 0.0; Max. NaN; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm2_5_atm

f6793062 Location:

Summary Statistics: Max. NaN; StDev NaN; Mean NaN; Valid 0.0; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm2_5_cf_1

f6793062 Location:

Summary Statistics: Valid 0.0; Max. NaN; Min. NaN; Mean NaN; StDev NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

pm_2.5

f6793062 Location:

Summary Statistics: Mean 28.020495211399144; Valid 12843.0; StDev 249.14573654787225; Min. 0.0; Max. 6666.0

Variable Format: numeric

Notes: UNF:6:pN9BYVNd4mFnt55kYqOh9g==

pressure

f6793062 Location:

Summary Statistics: Valid 12417.0; Min. 582.82; Max. 1038.49; StDev 64.49013161776594; Mean 969.9459853426754

Variable Format: numeric

Notes: UNF:6:m+qCEyavbi9KVFh5Lm14qw==

rssi

f6793062 Location:

Summary Statistics: Mean NaN; Max. NaN; Min. NaN; StDev NaN; Valid 0.0

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

temp_c

f6793062 Location:

Summary Statistics: Max. 191.66666666666669; Mean 27.729509367911554; StDev 7.0287046333448355; Min. 0.0; Valid 12442.0

Variable Format: numeric

Notes: UNF:6:CPiCY2OAXkcIqc2sEcidZA==

temp_f

f6793062 Location:

Summary Statistics: StDev 12.65166834002076; Valid 12442.0; Min. 32.0; Max. 377.0; Mean 81.9131168622408

Variable Format: numeric

Notes: UNF:6:Ch7xGeT46ziIkaeYHA/sUw==

uptime

f6793062 Location:

Summary Statistics: Max. NaN; StDev NaN; Mean NaN; Valid 0.0; Min. NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

version

f6793062 Location:

Summary Statistics: Max. NaN; Min. NaN; StDev NaN; Valid 0.0; Mean NaN

Variable Format: numeric

Notes: UNF:6:+biI7cSZFs0dEJk1+A+nww==

Other Study-Related Materials

Label:

CA_NA_pos.rds

Text:

Vector of location-days where the Di et al. daily estimates are missing

Notes:

application/gzip

Other Study-Related Materials

Label:

CA_with_CES_projected.rds

Text:

Dataframe of static characteristics (environmental, socio-demographic) of 1km x 1km grid points across California. The first 38 columns are identical to those of CA_clean_projected.rds (which is referenced in several scripts).

Notes:

application/gzip

Other Study-Related Materials

Label:

Daily_PM25_CA.rds

Text:

Di et al. daily PM2.5 estimates (1km x 1km) extracted for just California in 2016 (original dataset is national, 2000-2016)

Notes:

application/gzip

Other Study-Related Materials

Label:

Daily_PM_AQI-class.rds

Text:

Vector of AQI classifications of daily PM2.5 (from the Di et al. estimates)

Notes:

application/gzip
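For readers unfamiliar with AQI classes, the sketch below shows one way a daily PM2.5 concentration could be mapped to a class label. It uses the US EPA 24-hour PM2.5 breakpoints that applied before the 2024 revision; this codebook does not say which breakpoints or rounding rule the authors used, so the table, the function name aqi_class, and the simplified handling of values between breakpoints are all assumptions rather than the authors' method.

```python
# Upper bounds (ug/m3, 24-hour PM2.5) for each AQI class, following the
# US EPA table in force before the 2024 revision. Assumption: the codebook
# does not state which breakpoints were used for Daily_PM_AQI-class.rds.
AQI_BREAKPOINTS = [
    (12.0, "Good"),
    (35.4, "Moderate"),
    (55.4, "Unhealthy for Sensitive Groups"),
    (150.4, "Unhealthy"),
    (250.4, "Very Unhealthy"),
]


def aqi_class(pm25):
    """Return the AQI class label for a 24-hour PM2.5 concentration.

    Simplification: values are compared directly to the upper bounds, so a
    value between two published ranges (e.g. 12.05) falls in the higher class.
    EPA guidance truncates to one decimal place first, which is not done here.
    """
    if pm25 < 0:
        raise ValueError("PM2.5 concentration cannot be negative")
    for upper, label in AQI_BREAKPOINTS:
        if pm25 <= upper:
            return label
    return "Hazardous"


for value in (5.0, 20.0, 40.0, 100.0, 200.0, 300.0):
    print(value, aqi_class(value))
```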

Other Study-Related Materials

Label:

Real_deciles_updated_1-24-2022.rds

Text:

Vector of deciles of daily PM2.5 (from the Di et al. estimates), used for empirical LCS measurement error sampling

Notes:

application/gzip
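To make the notion of deciles concrete, the following sketch computes nine decile cut points from a list of values and looks up which decile bin a new value falls into. It is a generic illustration using Python's standard library with synthetic inputs; the interpolation method, the function names, and the bin convention are assumptions, and the authors' actual procedure is in the linked GitHub repository.

```python
import bisect
import statistics


def decile_cut_points(values):
    """Return the 9 cut points that split the data into 10 equal-probability bins.

    Uses statistics.quantiles with its default interpolation method; other
    tools may use a different quantile definition and give slightly different cuts.
    """
    return statistics.quantiles(values, n=10)


def decile_index(value, cuts):
    """Return the bin index (0 = lowest decile, 9 = highest) for a value."""
    return bisect.bisect_right(cuts, value)


# Synthetic values for illustration only; NOT PM2.5 estimates from the dataset.
values = [float(v) for v in range(1, 101)]
cuts = decile_cut_points(values)

print(cuts)
print(decile_index(1.0, cuts), decile_index(50.0, cuts), decile_index(100.0, cuts))
```

With cut points like these, each day's PM2.5 value can be assigned to a decile bin, and a quantity such as measurement error could then be sampled from observations in the same bin.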