An Available Water Capacity Pedotransfer Function using Random Forest - 2020 Cornell Soil Health Model (doi:10.7910/DVN/U5DAEP)

Document Description

Citation

Title:

An Available Water Capacity Pedotransfer Function using Random Forest - 2020 Cornell Soil Health Model

Identification Number:

doi:10.7910/DVN/U5DAEP

Distributor:

Harvard Dataverse

Date of Distribution:

2022-01-31

Version:

2

Bibliographic Citation:

Amsili, Joseph; Harold van Es; Robert Schindelbeck, 2022, "An Available Water Capacity Pedotransfer Function using Random Forest - 2020 Cornell Soil Health Model", https://doi.org/10.7910/DVN/U5DAEP, Harvard Dataverse, V2, UNF:6:165Gfin1rPR9XGn56TIr9A== [fileUNF]

Study Description

Citation

Title:

An Available Water Capacity Pedotransfer Function using Random Forest - 2020 Cornell Soil Health Model

Identification Number:

doi:10.7910/DVN/U5DAEP

Authoring Entity:

Amsili, Joseph (Cornell University)

Harold van Es (Cornell University)

Robert Schindelbeck (Cornell University)

Distributor:

Harvard Dataverse

Access Authority:

Amsili, Joseph

Depositor:

Amsili, Joseph

Date of Deposit:

2022-01-30

Holdings Information:

https://doi.org/10.7910/DVN/U5DAEP

Study Scope

Keywords:

Agricultural Sciences, AWC, Available Water Capacity, Random Forest, Soil Health Indicator, Field Capacity, Permanent Wilting Point

Abstract:

In late 2018, the Cornell Soil Health Laboratory determined that available water capacity (AWC), a valuable but time-intensive measurement, could be accurately predicted. A CASH database containing 7,232 soil samples was used to develop Random Forest models that predict field capacity, permanent wilting point, and AWC from a suite of measured parameters: % sand, % silt, % clay, organic matter, active carbon (also known as permanganate oxidizable carbon, POxC), respiration, wet aggregate stability, potassium, magnesium, iron, and manganese. The Random Forest model explained more variation in AWC than alternative multiple linear regression models. In spring 2024, the peer-reviewed manuscript "Pedotransfer functions for field capacity, permanent wilting point, and available water capacity based on random forest models for routine soil health analysis" was published. All random forest models and the training dataset can be downloaded here.

Notes:

The dataset was compiled from 7,232 samples run through the Cornell Soil Health Laboratory between 2015 and 2019. It contains texture data (sand, silt, and clay), wet aggregate stability (WAS), soil organic matter (SOM), 4-day soil respiration (Resp), active carbon (AC; also referred to as permanganate oxidizable carbon, POxC, in the scientific literature), and Modified Morgan extractable K, Mg, Fe, and Mn in ppm. The dataset also includes field capacity, permanent wilting point, and available water capacity (AWC), which were measured on disturbed soil samples (< 2 mm) that were equilibrated after initial saturation to pressures of -10 kPa and -1500 kPa on porous ceramic pressure plates in pressure chambers (Soil Moisture Equipment Corp., Goleta, CA). Columns include: RowNumber, sand, silt, clay, WAS, SOM, Resp, AC, K, Mg, Fe, Mn, and AWC.
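
This record describes AWC as field capacity (water content at -10 kPa) minus permanent wilting point (water content at -1500 kPa). A minimal Python sketch of that relationship follows. The function name and the example water contents are hypothetical and are not taken from the dataset or from the authors' R code.

```python
def available_water_capacity(field_capacity, permanent_wilting_point):
    """Return AWC as field capacity minus permanent wilting point.

    Both inputs are water contents in the same units (the units are an
    assumption of this sketch; the record does not state them here).
    """
    if permanent_wilting_point > field_capacity:
        # Water held at -1500 kPa should not exceed water held at -10 kPa.
        raise ValueError("permanent wilting point exceeds field capacity")
    return field_capacity - permanent_wilting_point


# Illustrative values only, not measurements from the dataset.
awc = available_water_capacity(0.30, 0.12)
print(round(awc, 2))
```

The guard against a wilting point above field capacity is a design choice of this sketch for catching swapped arguments or data-entry errors.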

Methodology and Processing

Sources Statement

Data Access

Notes:

This dataset is unavailable for use until the publication of our manuscript.

Other Study Description Materials

Related Publications

Citation

Title:

Amsili, J.P., H.M. van Es and R.R. Schindelbeck. 2024. Pedotransfer Functions for Field Capacity, Permanent Wilting Point, and Available Water Capacity Based on Random Forest Models for Routine Soil Health Analysis. Communications in Soil Science and Plant Analysis 55: 1967-1984. doi:10.1080/00103624.2024.2336573.

Identification Number:

10.1080/00103624.2024.2336573

Bibliographic Citation:

Amsili, J.P., H.M. van Es and R.R. Schindelbeck. 2024. Pedotransfer Functions for Field Capacity, Permanent Wilting Point, and Available Water Capacity Based on Random Forest Models for Routine Soil Health Analysis. Communications in Soil Science and Plant Analysis 55: 1967-1984. doi:10.1080/00103624.2024.2336573.

File Description--f10282456

File: 2024 Tutorial samples to use RF model file.tab

  • Number of cases: 3

  • No. of variables per record: 12

  • Type of File: text/tab-separated-values

Notes:

UNF:6:165Gfin1rPR9XGn56TIr9A==
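
A tab-separated file with the 12 variables listed in this codebook could be read as sketched below. This is an illustration under assumptions: the helper name `read_samples` is hypothetical, the header names are assumed to match the variable names in the Variable Description section, and the single example row is made up for demonstration.

```python
import csv
import io

# Variable names as listed in this codebook (assumed to be the file header).
EXPECTED_COLUMNS = [
    "SampleID", "sand", "silt", "clay", "WAS", "OM",
    "Resp", "AC", "K", "Mg", "Fe", "Mn",
]


def read_samples(handle):
    """Parse a tab-separated file and convert the 11 predictors to floats."""
    reader = csv.DictReader(handle, delimiter="\t")
    found = reader.fieldnames or []
    if set(found) != set(EXPECTED_COLUMNS):
        raise ValueError(f"unexpected columns: {found}")
    rows = []
    for record in reader:
        row = {"SampleID": record["SampleID"]}  # character variable
        for name in EXPECTED_COLUMNS[1:]:       # numeric variables
            row[name] = float(record[name])
        rows.append(row)
    return rows


# Hypothetical one-row file used only to exercise the parser.
example = (
    "\t".join(EXPECTED_COLUMNS) + "\n"
    + "\t".join(["demo-1", "30", "55", "15", "35", "3.5",
                 "0.8", "700", "100", "200", "2", "5"]) + "\n"
)
samples = read_samples(io.StringIO(example))
print(samples[0]["SampleID"], samples[0]["sand"])
```

To read a downloaded copy, the same function could be called on `open(path, newline="")`, where `path` points to the local file.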

Variable Description

List of Variables:

Variables

SampleID

f10282456 Location:

Variable Format: character

Notes: UNF:6:4JhmWjz3VEEmKCfID1WOBA==

sand

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 20.0; Max. 40.0; Mean 30.0; StDev 10.0

Variable Format: numeric

Notes: UNF:6:ZY4ABplobHy9OaBlKXhY+g==

silt

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 35.0; Max. 65.0; Mean 51.666666666666664; StDev 15.275252316519467

Variable Format: numeric

Notes: UNF:6:mJwcjndA2xetCO6g+XHFeQ==

clay

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 15.0; Max. 25.0; Mean 18.333333333333332; StDev 5.773502691896257

Variable Format: numeric

Notes: UNF:6:gXkr68rmXCIEBGcWR3eqxg==

WAS

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 25.0; Max. 50.0; Mean 36.666666666666664; StDev 12.583057392117915

Variable Format: numeric

Notes: UNF:6:M22F9rpxJ9lsgNX8BjIOfA==

OM

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 3.0; Max. 4.0; Mean 3.5; StDev 0.5

Variable Format: numeric

Notes: UNF:6:Oj7peFEzQz4umnz4Utjl6w==

Resp

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 0.6; Max. 1.0; Mean 0.8; StDev 0.2

Variable Format: numeric

Notes: UNF:6:mEcvmigI2AEfRNy6znWKSw==

AC

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 600.0; Max. 800.0; Mean 700.0; StDev 100.0

Variable Format: numeric

Notes: UNF:6:OvckdW3bRKctXn7MGeGcRw==

K

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 70.0; Max. 200.0; Mean 123.33333333333333; StDev 68.06859285554046

Variable Format: numeric

Notes: UNF:6:PqeDEpGVBYbh/WJOMen+Cg==

Mg

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 100.0; Max. 300.0; Mean 200.0; StDev 100.0

Variable Format: numeric

Notes: UNF:6:OvwRq3KrumSZY/Ljve0sig==

Fe

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 2.0; Max. 4.0; Mean 2.6666666666666665; StDev 1.1547005383792515

Variable Format: numeric

Notes: UNF:6:gdMc/doIobc7b+UxDfEGZw==

Mn

f10282456 Location:

Summary Statistics: Valid 3.0; Min. 3.0; Max. 8.0; Mean 5.333333333333333; StDev 2.516611478423583

Variable Format: numeric

Notes: UNF:6:ztquEB9agPqWROQzge2LEw==
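
Because each numeric variable has exactly three valid cases, the reported minimum, maximum, and mean determine all three values: the remaining value is three times the mean minus the minimum and the maximum. The Python sketch below uses that to check the reported StDev, which appears to be the sample standard deviation (n - 1 denominator). The helper names are hypothetical, and the check applies only because Valid is 3.

```python
import math
import statistics


def third_value(minimum, maximum, mean):
    """With three cases, the remaining value follows from the mean."""
    return 3 * mean - minimum - maximum


def stdev_matches(minimum, maximum, mean, reported_stdev):
    """Rebuild the three values and compare their sample standard
    deviation (n - 1 denominator) with the reported StDev."""
    values = [minimum, third_value(minimum, maximum, mean), maximum]
    return math.isclose(statistics.stdev(values), reported_stdev,
                        rel_tol=1e-6)


# Statistics copied from the sand, silt, and K entries of this codebook.
print(stdev_matches(20.0, 40.0, 30.0, 10.0))
print(stdev_matches(35.0, 65.0, 51.666666666666664, 15.275252316519467))
print(stdev_matches(70.0, 200.0, 123.33333333333333, 68.06859285554046))
```

The same check can be applied to any other variable in the list by passing its Min., Max., Mean, and StDev.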

Other Study-Related Materials

Label:

2015-2019_CompleteAWC_w.1and15bar_dataset_toshare.xlsx

Text:

Cornell Soil Health Database (training dataset) used to train the random forest models for field capacity, permanent wilting point, and available water capacity.

Notes:

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Other Study-Related Materials

Label:

2020 Randomforest-tutorial-for-AWC.pdf

Text:

PDF generated from an R Markdown file that describes the AWC RF code and model diagnostics.

Notes:

application/pdf

Other Study-Related Materials

Label:

2024 Example tutorial for using AWC RF model file.R

Text:

R script that explains how to load the RF model files and use them to predict field capacity, permanent wilting point, and available water capacity.

Notes:

type/x-r-syntax

Other Study-Related Materials

Label:

Amsili et al 2024 Pedotransfer functions for field capacity permanent wilting point and available water capacity based on random forest modeling w. suppliment.pdf

Text:

Peer-Reviewed Paper: "Pedotransfer Functions for Field Capacity, Permanent Wilting Point, and Available Water Capacity Based on Random Forest Models for Routine Soil Health Analysis"

Notes:

application/pdf

Other Study-Related Materials

Label:

availablewatercapacity_full_randomforestmodel_2023.rda

Text:

Full Random Forest Model for predicting available water capacity (field capacity minus permanent wilting point)

Notes:

application/x-rlang-transport

Other Study-Related Materials

Label:

availablewatercapacity_reduced_randomforestmodel_2023.rda

Text:

Reduced Random Forest Model for predicting available water capacity (field capacity minus permanent wilting point)

Notes:

application/x-rlang-transport

Other Study-Related Materials

Label:

fieldcapacity_full_randomforestmodel_2023.rda

Text:

Full Random Forest Model for predicting field capacity (-10 kPa)

Notes:

application/x-rlang-transport

Other Study-Related Materials

Label:

fieldcapacity_reduced_randomforestmodel_2023.rda

Text:

Reduced Random Forest Model for predicting field capacity (-10 kPa)

Notes:

application/x-rlang-transport

Other Study-Related Materials

Label:

permanentwiltingpoint_full_randomforestmodel_2023.rda

Text:

Full Random Forest Model for predicting permanent wilting point (-1500 kPa)

Notes:

application/x-rlang-transport

Other Study-Related Materials

Label:

permanentwiltingpoint_reduced_randomforestmodel_2023.rda

Text:

Reduced Random Forest Model for predicting permanent wilting point (-1500 kPa)

Notes:

application/x-rlang-transport