Last updated: 2021-07-19
This code creates keys to relate USDA acreage data to USGS crop categories for pesticide data for California. The USGS pesticide data for California are derived from a different dataset (CA Pesticide Use Reporting, CA-PUR) than the data for the rest of the country, and CA-PUR includes a greater range of crops than the main dataset used by USGS (survey data from Kynetec). Because the CA-PUR data are reported in the form of use rate, USGS did not need to estimate crop acreage for this state. I am therefore matching CA-PUR crops to USDA crops as closely as possible.
Data sources for USDA crop acreage data are described in the data extraction code. Information on which crops are included in the CA-PUR dataset is from USGS metadata, Baker & Stone (2015) Appendix 1, and personal communication with Nancy Baker.
library(tidyverse)
library(data.table)
library(stringr)
# Acreage data
crop_data <- read.csv("../output_big/nass_survey/qs.crops.ac.nat_20200404.csv")
str(crop_data)
# Load keys created for the rest of the continental US
harv_key <- read.csv("../keys/crop_key_harv.csv", stringsAsFactors=FALSE)
str(harv_key)
plant_key <- read.csv("../keys/crop_key_plant.csv", stringsAsFactors=FALSE)
str(plant_key)
Select crop acreage records that meet the following conditions: annual frequency, a full-year reference period, and a year after 1991.
crop_data_sub <- crop_data %>%
filter(FREQ_DESC=="ANNUAL"&
REFERENCE_PERIOD_DESC=="YEAR" &
(YEAR>1991))
crop_names <- crop_data_sub %>%
group_by(SOURCE_DESC, SHORT_DESC, YEAR) %>%
summarise(n = length(VALUE)) %>%
spread(key = SOURCE_DESC, value = n)
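To make the shape of `crop_names` concrete, the sketch below runs the same count-and-spread step on a small made-up table. The rows and values in `toy` are invented for illustration; only the column names are taken from the code above.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for crop_data_sub; all values are invented for illustration
toy <- data.frame(
  SOURCE_DESC = c("CENSUS", "SURVEY", "SURVEY", "CENSUS"),
  SHORT_DESC  = c("FIGS - ACRES BEARING", "FIGS - ACRES BEARING",
                  "FIGS - ACRES BEARING", "DATES - ACRES BEARING"),
  YEAR        = c(2012, 2012, 2012, 2012),
  VALUE       = c("100", "90", "95", "40"),
  stringsAsFactors = FALSE
)

# Count records per source, data item, and year, then put sources in columns
toy_names <- toy %>%
  group_by(SOURCE_DESC, SHORT_DESC, YEAR) %>%
  summarise(n = length(VALUE)) %>%
  ungroup() %>%
  spread(key = SOURCE_DESC, value = n)

print(toy_names)
```

Each row is one data item in one year, with one column per source giving the number of records. An `NA` in a source column means that source has no record for that item and year.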
The goal here is to create keys to relate USDA crop data to USGS surveyed crops. There are two keys because of slightly different types of acreage data - only those acres that were harvested, and all acres in the ground (planted). Availability of these items can vary across crops, years, and datasets (Census vs. Survey).
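As a rough sketch of how the two keys differ, the helper below builds matching harvested and planted entries for a few orchard crops, using the convention applied later in this script (`ACRES BEARING` for harvested, `ACRES BEARING & NON-BEARING` for planted). The function `make_orchard_rows` and the column names `USGS_crop` and `USGS_group` are hypothetical; only `SHORT_DESC` is known from the joins used in this script.

```r
# Hypothetical helper: build key rows for orchard-type crops.
# USGS_crop and USGS_group are assumed column names, not taken from the keys.
make_orchard_rows <- function(nass_name, usgs_name, group, planted = FALSE) {
  suffix <- if (planted) "ACRES BEARING & NON-BEARING" else "ACRES BEARING"
  data.frame(
    SHORT_DESC = paste(nass_name, suffix, sep = " - "),
    USGS_crop  = usgs_name,
    USGS_group = group,
    stringsAsFactors = FALSE
  )
}

# A few orchard crops from the California additions
orchard <- data.frame(
  nass = c("CHESTNUTS", "DATES", "FIGS"),
  usgs = c("Chestnuts", "Dates", "Figs"),
  stringsAsFactors = FALSE
)

harv_rows  <- make_orchard_rows(orchard$nass, orchard$usgs, "Orchards_and_grapes")
plant_rows <- make_orchard_rows(orchard$nass, orchard$usgs, "Orchards_and_grapes",
                                planted = TRUE)

print(harv_rows)
print(plant_rows)
```

Building rows this way keeps the harvested and planted entries for a crop in step, which matters less for a one-off key than for one that is revised often.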
Notes
This category includes tree crops (fruits, nuts) and grapes. According to the metadata associated with the USGS dataset, in addition to those crops represented in the rest of the country, in California this category includes chestnuts, dates, figs, kiwifruit, kumquats, limes, mangoes, olives, other citrus, other nut trees, other pome fruit, other stone fruit, papayas, and persimmons.
# look for all data items with orchard crops or grapes in the short description
chestnut_names <- filter(crop_names, str_detect(SHORT_DESC, "CHESTNUT"))
date_names <- filter(crop_names, str_detect(SHORT_DESC, "DATE"))
fig_names <- filter(crop_names, str_detect(SHORT_DESC, "FIG"))
kiwi_names <- filter(crop_names, str_detect(SHORT_DESC, "KIWI"))
kumquat_names <- filter(crop_names, str_detect(SHORT_DESC, "KUMQUAT"))
lime_names <- filter(crop_names, str_detect(SHORT_DESC, "LIME"))
mango_names <- filter(crop_names, str_detect(SHORT_DESC, "MANGO"))
olive_names <- filter(crop_names, str_detect(SHORT_DESC, "OLIVE"))
papaya_names <- filter(crop_names, str_detect(SHORT_DESC, "PAPAYA"))
persimmon_names <- filter(crop_names, str_detect(SHORT_DESC, "PERSIMMON"))
citrus_names <- filter(crop_names, str_detect(SHORT_DESC, "CITRUS, OTHER"))
nut_names <- filter(crop_names, str_detect(SHORT_DESC, "TREE NUTS, OTHER"))
stone_names <- filter(crop_names, str_detect(SHORT_DESC, "STONE"))
Notes
ACRES BEARING is treated as harvested acres.
ACRES BEARING & NON-BEARING is treated as planted acres.
STONE FRUITS, OTHER does not appear to be a data item in the Census or Survey.
# add to list of data items for harvested acres
harv_key_CA <- rbind(harv_key,
c("CHESTNUTS - ACRES BEARING",
"Chestnuts",
"Orchards_and_grapes"),
c("CITRUS, OTHER - ACRES BEARING",
"OtherCitrusFruit",
"Orchards_and_grapes"),
c("DATES - ACRES BEARING",
"Dates",
"Orchards_and_grapes"),
c("FIGS - ACRES BEARING",
"Figs",
"Orchards_and_grapes"),
c("KIWIFRUIT - ACRES BEARING",
"Kiwifruit",
"Orchards_and_grapes"),
c("KUMQUATS - ACRES BEARING",
"Kumquats",
"Orchards_and_grapes"),
c("LIMES - ACRES BEARING",
"Limes",
"Orchards_and_grapes"),
c("MANGOES - ACRES BEARING",
"Mangoes",
"Orchards_and_grapes"),
c("TREE NUTS, OTHER - ACRES BEARING",
"OtherNuts",
"Orchards_and_grapes"),
c("OLIVES - ACRES BEARING",
"Olives",
"Orchards_and_grapes"),
c("PAPAYAS - ACRES BEARING",
"Papayas",
"Orchards_and_grapes"),
c("PERSIMMONS - ACRES BEARING",
"Persimmons",
"Orchards_and_grapes"))
# add to list of data items for planted acres
plant_key_CA <- rbind(plant_key,
c("CHESTNUTS - ACRES BEARING & NON-BEARING",
"Chestnuts",
"Orchards_and_grapes"),
c("CITRUS, OTHER - ACRES BEARING & NON-BEARING",
"OtherCitrusFruit",
"Orchards_and_grapes"),
c("DATES - ACRES BEARING & NON-BEARING",
"Dates",
"Orchards_and_grapes"),
c("FIGS - ACRES BEARING & NON-BEARING",
"Figs",
"Orchards_and_grapes"),
c("KIWIFRUIT - ACRES BEARING & NON-BEARING",
"Kiwifruit",
"Orchards_and_grapes"),
c("KUMQUATS - ACRES BEARING & NON-BEARING",
"Kumquats",
"Orchards_and_grapes"),
c("LIMES - ACRES BEARING & NON-BEARING",
"Limes",
"Orchards_and_grapes"),
c("MANGOES - ACRES BEARING & NON-BEARING",
"Mangoes",
"Orchards_and_grapes"),
c("TREE NUTS, OTHER - ACRES BEARING & NON-BEARING",
"OtherNuts",
"Orchards_and_grapes"),
c("OLIVES - ACRES BEARING & NON-BEARING",
"Olives",
"Orchards_and_grapes"),
c("PAPAYAS - ACRES BEARING & NON-BEARING",
"Papayas",
"Orchards_and_grapes"),
c("PERSIMMONS - ACRES BEARING & NON-BEARING",
"Persimmons",
"Orchards_and_grapes"))
This category includes vegetables, melons, and berries. According to the metadata associated with the USGS dataset, in addition to those crops represented in the rest of the country, in California this category includes avocados, beets, tame blueberries, brussels sprouts, bulb crops, chicory, cole crops, collards, cranberries, currants, daikon, eggplant, escarole or endive, ginger root, guavas, herbs, horseradish, kale, mustard greens, okra, other non-citrus fruit, other leafy vegetables, other roots or tubers, other vegetables, parsley, pineapples, radishes, rhubarb, sweet potatoes, and turnips.
# look for all data items with vegetable and fruit crops in the short description
avocado_names <- filter(crop_names, str_detect(SHORT_DESC, "AVOCADO"))
beet_names <- filter(crop_names, str_detect(SHORT_DESC, "BEET"))
blueberry_names <- filter(crop_names, str_detect(SHORT_DESC, "BLUEBERR"))
brussel_names <- filter(crop_names, str_detect(SHORT_DESC, "BRUSSEL"))
bulb_names <- filter(crop_names, str_detect(SHORT_DESC, "BULB"))
chicory_names <- filter(crop_names, str_detect(SHORT_DESC, "CHICORY"))
cole_names <- filter(crop_names, str_detect(SHORT_DESC, "COLE"))
collard_names <- filter(crop_names, str_detect(SHORT_DESC, "COLLARD"))
cranberry_names <- filter(crop_names, str_detect(SHORT_DESC, "CRANBERR"))
currant_names <- filter(crop_names, str_detect(SHORT_DESC, "CURRANT"))
daikon_names <- filter(crop_names, str_detect(SHORT_DESC, "DAIKON"))
eggplant_names <- filter(crop_names, str_detect(SHORT_DESC, "EGGPLANT"))
escarole_names <- filter(crop_names, str_detect(SHORT_DESC, "ESCAROLE"))
endive_names <- filter(crop_names, str_detect(SHORT_DESC, "ENDIVE"))
ginger_names <- filter(crop_names, str_detect(SHORT_DESC, "GINGER"))
guava_names <- filter(crop_names, str_detect(SHORT_DESC, "GUAVA"))
herb_names <- filter(crop_names, str_detect(SHORT_DESC, "HERB"))
horseradish_names <- filter(crop_names, str_detect(SHORT_DESC, "HORSERADISH"))
kale_names <- filter(crop_names, str_detect(SHORT_DESC, "KALE"))
mustard_names <- filter(crop_names, str_detect(SHORT_DESC, "MUSTARD"))
okra_names <- filter(crop_names, str_detect(SHORT_DESC, "OKRA"))
non_citrus_names <- filter(crop_names, str_detect(SHORT_DESC, "NON-CITRUS"))
leafy_names <- filter(crop_names, str_detect(SHORT_DESC, "LEAFY"))
tuber_names <- filter(crop_names, str_detect(SHORT_DESC, "TUBER"))
vegetable <- filter(crop_names, str_detect(SHORT_DESC, "VEGETABLES, OTHER"))
parsley_names <- filter(crop_names, str_detect(SHORT_DESC, "PARSLEY"))
pineapple_names <- filter(crop_names, str_detect(SHORT_DESC, "PINEAPPLE"))
radish_names <- filter(crop_names, str_detect(SHORT_DESC, "RADISH"))
rhubarb_names <- filter(crop_names, str_detect(SHORT_DESC, "RHUBARB"))
sweet_potato <- filter(crop_names, str_detect(SHORT_DESC, "SWEET POTATO"))
turnip_names <- filter(crop_names, str_detect(SHORT_DESC, "TURNIP"))
Notes
ACRES PLANTED
because it’s unclear what that would reflect
# add to list of data items for harvested acres
harv_key_CA <- rbind(harv_key_CA,
c("AVOCADOS - ACRES BEARING",
"Avocados",
"Vegetables_and_fruit"),
c("BEETS - ACRES HARVESTED",
"Beets",
"Vegetables_and_fruit"),
c("BLUEBERRIES, TAME - ACRES HARVESTED",
"BlueberriesTame",
"Vegetables_and_fruit"),
c("BRUSSELS SPROUTS - ACRES HARVESTED",
"BrusselsSprouts",
"Vegetables_and_fruit"),
c("CHICORY - ACRES HARVESTED",
"Chicory",
"Vegetables_and_fruit"),
c("GREENS, COLLARD - ACRES HARVESTED",
"Collards",
"Vegetables_and_fruit"),
c("CRANBERRIES - ACRES HARVESTED",
"Cranberries",
"Vegetables_and_fruit"),
c("CURRANTS - ACRES HARVESTED",
"Currants",
"Vegetables_and_fruit"),
c("DAIKON - ACRES HARVESTED",
"Daikon",
"Vegetables_and_fruit"),
c("EGGPLANT - ACRES HARVESTED",
"Eggplant",
"Vegetables_and_fruit"),
c("ESCAROLE & ENDIVE - ACRES HARVESTED",
"EscaroleAndEndive",
"Vegetables_and_fruit"),
c("GINGER ROOT - ACRES HARVESTED",
"GingerRoot",
"Vegetables_and_fruit"),
c("GUAVAS - ACRES BEARING",
"Guavas",
"Vegetables_and_fruit"),
c("HERBS, DRY - ACRES HARVESTED",
"Herbs",
"Vegetables_and_fruit"),
c("HERBS, FRESH CUT - ACRES HARVESTED",
"Herbs",
"Vegetables_and_fruit"),
c("HORSERADISH - ACRES HARVESTED",
"Horseradish",
"Vegetables_and_fruit"),
c("GREENS, KALE - ACRES HARVESTED",
"Kale",
"Vegetables_and_fruit"),
c("GREENS, MUSTARD - ACRES HARVESTED",
"MustardGreens",
"Vegetables_and_fruit"),
c("NON-CITRUS, OTHER, (EXCL BERRIES) - ACRES BEARING",
"OtherNonCitrusFruit",
"Vegetables_and_fruit"),
c("OKRA - ACRES HARVESTED",
"Okra",
"Vegetables_and_fruit"),
c("PARSLEY - ACRES HARVESTED",
"Parsley",
"Vegetables_and_fruit"),
c("PINEAPPLE - ACRES HARVESTED",
"Pineapple",
"Vegetables_and_fruit"),
c("RADISHES - ACRES HARVESTED",
"Radishes",
"Vegetables_and_fruit"),
c("RHUBARB - ACRES HARVESTED",
"Rhubarb",
"Vegetables_and_fruit"),
c("SWEET POTATOES - ACRES HARVESTED",
"SweetPotatoes",
"Vegetables_and_fruit"),
c("TURNIPS - ACRES HARVESTED",
"Turnips",
"Vegetables_and_fruit"))
# add to list of data items for planted acres
plant_key_CA <- rbind(plant_key_CA,
c("AVOCADOS - ACRES BEARING & NON-BEARING",
"Avocados",
"Vegetables_and_fruit"),
c("BEETS, PROCESSING - ACRES PLANTED",
"Beets",
"Vegetables_and_fruit"),
c(NA,
"BlueberriesTame",
"Vegetables_and_fruit"),
c("BRUSSELS SPROUTS - ACRES PLANTED",
"BrusselsSprouts",
"Vegetables_and_fruit"),
c(NA,
"Chicory",
"Vegetables_and_fruit"),
c("GREENS, COLLARD - ACRES PLANTED",
"Collards",
"Vegetables_and_fruit"),
c(NA,
"Cranberries",
"Vegetables_and_fruit"),
c(NA,
"Currants",
"Vegetables_and_fruit"),
c(NA,
"Daikon",
"Vegetables_and_fruit"),
c("EGGPLANT, FRESH MARKET - ACRES PLANTED",
"Eggplant",
"Vegetables_and_fruit"),
c("ESCAROLE & ENDIVE, FRESH MARKET - ACRES PLANTED",
"EscaroleAndEndive",
"Vegetables_and_fruit"),
c(NA,
"GingerRoot",
"Vegetables_and_fruit"),
c("GUAVAS - ACRES BEARING & NON-BEARING",
"Guavas",
"Vegetables_and_fruit"),
c(NA,
"Herbs",
"Vegetables_and_fruit"),
c(NA,
"Horseradish",
"Vegetables_and_fruit"),
c("GREENS, KALE - ACRES PLANTED",
"Kale",
"Vegetables_and_fruit"),
c("GREENS, MUSTARD - ACRES PLANTED",
"MustardGreens",
"Vegetables_and_fruit"),
c("NON-CITRUS, OTHER, (EXCL BERRIES) - ACRES BEARING & NON-BEARING",
"OtherNonCitrusFruit",
"Vegetables_and_fruit"),
c("OKRA - ACRES PLANTED",
"Okra",
"Vegetables_and_fruit"),
c(NA,
"Parsley",
"Vegetables_and_fruit"),
c(NA,
"Pineapple",
"Vegetables_and_fruit"),
c("RADISHES - ACRES PLANTED",
"Radishes",
"Vegetables_and_fruit"),
c(NA,
"Rhubarb",
"Vegetables_and_fruit"),
c("SWEET POTATOES - ACRES PLANTED",
"SweetPotatoes",
"Vegetables_and_fruit"),
c(NA,
"Turnips",
"Vegetables_and_fruit"))
This category includes assorted crops that do not fit into other categories. According to the metadata, in addition to those crops included in the national dataset, this category includes flax or flaxseed, grass seed crops, hops, jojoba, mustard seed, oats or rye for grain, safflower, sesame, taro, triticale, wild rice, and woodland crops.
# look for all data items with other crops in the short description
flax_names <- filter(crop_names, str_detect(SHORT_DESC, "FLAX"))
grass_names <- filter(crop_names, str_detect(SHORT_DESC, "GRASS"))
hops_names <- filter(crop_names, str_detect(SHORT_DESC, "HOPS"))
jojoba_names <- filter(crop_names, str_detect(SHORT_DESC, "JOJOBA"))
mustard_names <- filter(crop_names, str_detect(SHORT_DESC, "MUSTARD"))
oats_names <- filter(crop_names, str_detect(SHORT_DESC, "OATS"))
rye_names <- filter(crop_names, str_detect(SHORT_DESC, "RYE"))
safflower_names <- filter(crop_names, str_detect(SHORT_DESC, "SAFFLOWER"))
sesame_names <- filter(crop_names, str_detect(SHORT_DESC, "SESAME"))
taro_names <- filter(crop_names, str_detect(SHORT_DESC, "TARO"))
triticale_names <- filter(crop_names, str_detect(SHORT_DESC, "TRITICALE"))
wild_rice_names <- filter(crop_names, str_detect(SHORT_DESC, "WILD"))
tree_names <- filter(crop_names, str_detect(SHORT_DESC, "TREE"))
Notes
# add to list of data items for harvested acres
harv_key_CA <- rbind(harv_key_CA,
c("FLAXSEED - ACRES HARVESTED",
"Flaxseed",
"Other_crops"),
c("GRASSES, BERMUDA GRASS, SEED - ACRES HARVESTED",
"FieldAndGrassSeedCropsAll",
"Other_crops"),
c("GRASSES, SUDANGRASS, SEED - ACRES HARVESTED",
"FieldAndGrassSeedCropsAll",
"Other_crops"),
c("HOPS - ACRES HARVESTED",
"Hops",
"Other_crops"),
c("JOJOBA - ACRES HARVESTED",
"Jojoba",
"Other_crops"),
c("MUSTARD - ACRES HARVESTED",
"MustardSeed",
"Other_crops"),
c("OATS - ACRES HARVESTED",
"OatsForGrain",
"Other_crops"),
c("RYE - ACRES HARVESTED",
"RyeForGrain",
"Other_crops"),
c("SAFFLOWER - ACRES HARVESTED",
"Safflower",
"Other_crops"),
c("SUGARCANE, SUGAR - ACRES HARVESTED",
"Sugarcane",
"Other_crops"),
c("SUGARCANE, SEED - ACRES HARVESTED",
"Sugarcane",
"Other_crops"),
c("TARO - ACRES HARVESTED",
"Taro",
"Other_crops"),
c("TRITICALE - ACRES HARVESTED",
"Triticale",
"Other_crops"),
c("WILD RICE - ACRES HARVESTED",
"WildRice",
"Other_crops"),
c("CUT CHRISTMAS TREES - ACRES IN PRODUCTION",
"OtherCrops",
"Other_crops"))
# add to list of data items for planted acres
plant_key_CA <- rbind(plant_key_CA,
c("FLAXSEED - ACRES PLANTED",
"Flaxseed",
"Other_crops"),
c(NA,
"FieldAndGrassSeedCropsAll",
"Other_crops"),
c(NA,
"Hops",
"Other_crops"),
c(NA,
"Jojoba",
"Other_crops"),
c("MUSTARD - ACRES PLANTED",
"MustardSeed",
"Other_crops"),
c("OATS - ACRES PLANTED",
"OatsForGrain",
"Other_crops"),
c("RYE - ACRES PLANTED",
"RyeForGrain",
"Other_crops"),
c("SAFFLOWER - ACRES PLANTED",
"Safflower",
"Other_crops"),
c(NA,
"Taro",
"Other_crops"),
c(NA,
"Triticale",
"Other_crops"),
c(NA,
"WildRice",
"Other_crops"),
c(NA,
"OtherCrops",
"Other_crops"))
This category includes pasture and non-alfalfa hay crops.
Notes
Pasture_and_hay
summary_key_CA <- harv_key_CA %>%
filter(!str_detect(SHORT_DESC,"BEARING")) %>%
rbind(filter(plant_key_CA, str_detect(SHORT_DESC,"BEARING")))
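The summary key appears to take harvested acres for crops reported that way and bearing plus non-bearing acres for orchard-type crops. The sketch below reproduces that logic on two invented toy keys; the `crop` column name and all rows are assumptions for illustration.

```r
library(dplyr)
library(stringr)

# Toy keys; rows and the `crop` column name are invented for illustration
harv_toy <- data.frame(
  SHORT_DESC = c("FIGS - ACRES BEARING", "OKRA - ACRES HARVESTED"),
  crop       = c("Figs", "Okra"),
  stringsAsFactors = FALSE
)
plant_toy <- data.frame(
  SHORT_DESC = c("FIGS - ACRES BEARING & NON-BEARING", "OKRA - ACRES PLANTED", NA),
  crop       = c("Figs", "Okra", "Taro"),
  stringsAsFactors = FALSE
)

# Keep non-orchard items from the harvested key, then add the
# bearing & non-bearing items from the planted key
summary_toy <- harv_toy %>%
  filter(!str_detect(SHORT_DESC, "BEARING")) %>%
  rbind(filter(plant_toy, str_detect(SHORT_DESC, "BEARING")))

print(summary_toy)
```

Note that `filter()` drops rows where the condition evaluates to `NA`, so key rows with a missing `SHORT_DESC` do not carry over into the summary key.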
# Add remaining data items from the 'economic' part of the Census for pastureland category
summary_key_CA <- rbind(summary_key_CA,
c("AG LAND, PASTURELAND - ACRES",
"Pasture",
"Pasture_and_hay"),
c("AG LAND, CROPLAND, (EXCL HARVESTED & PASTURED), CULTIVATED SUMMER FALLOW - ACRES",
"Fallow",
"Pasture_and_hay"))
# make sure all data items were entered correctly by joining back to original data item names (only ones that shouldn't match are items from 'economic' dataset, not crops)
crop_names_short <- as.data.frame(unique(crop_names$SHORT_DESC))
crop_names_short$test <- "test"
names(crop_names_short) <- c("SHORT_DESC", "test")
harv_check <- left_join(harv_key_CA, crop_names_short, by="SHORT_DESC")
plant_check <- left_join(plant_key_CA, crop_names_short, by="SHORT_DESC")
summary_check <- left_join(summary_key_CA, crop_names_short, by="SHORT_DESC")
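The check objects above still need to be inspected by eye. One possible way to list the unmatched items directly is sketched below on invented toy data; the deliberate typo, the `crop` column name, and the object names ending in `_toy` are all assumptions for illustration.

```r
library(dplyr)

# Toy key with one deliberate typo and one intentionally missing data item
key_toy <- data.frame(
  SHORT_DESC = c("FIGS - ACRES BEARING", "FIG - ACRES BEARING", NA),
  crop       = c("Figs", "Figs", "Taro"),
  stringsAsFactors = FALSE
)

# Toy list of data item names found in the acreage data
names_toy <- data.frame(
  SHORT_DESC = c("FIGS - ACRES BEARING", "DATES - ACRES BEARING"),
  test       = "test",
  stringsAsFactors = FALSE
)

check_toy <- left_join(key_toy, names_toy, by = "SHORT_DESC")

# Rows with a data item name that found no match in the acreage data;
# rows entered as NA on purpose are excluded
unmatched <- check_toy %>%
  filter(!is.na(SHORT_DESC), is.na(test))

print(unmatched)
```

Applied to the real keys, any row returned this way would be either a misspelled data item or one of the 'economic' items that are not expected to match.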
write.csv(harv_key_CA, "../keys/crop_key_harv_CA.csv", row.names=FALSE)
write.csv(plant_key_CA, "../keys/crop_key_plant_CA.csv", row.names=FALSE)
write.csv(summary_key_CA, "../keys/crop_key_summary_CA.csv", row.names=FALSE)
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.14.0 forcats_0.4.0 stringr_1.4.0
[4] dplyr_0.8.3 purrr_0.3.2 readr_1.3.1
[7] tidyr_1.1.0 tibble_2.1.3 ggplot2_3.2.0
[10] tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 cellranger_1.1.0 pillar_1.4.2 compiler_3.6.1
[5] tools_3.6.1 digest_0.6.20 lubridate_1.7.4 jsonlite_1.6
[9] evaluate_0.14 lifecycle_0.2.0 nlme_3.1-140 gtable_0.3.0
[13] lattice_0.20-38 pkgconfig_2.0.2 rlang_0.4.7 cli_1.1.0
[17] rstudioapi_0.10 yaml_2.2.0 haven_2.1.1 xfun_0.8
[21] withr_2.1.2 xml2_1.2.0 httr_1.4.2 knitr_1.23
[25] hms_0.5.0 generics_0.0.2 vctrs_0.3.2 grid_3.6.1
[29] tidyselect_1.1.0 glue_1.3.1 R6_2.4.0 readxl_1.3.1
[33] rmarkdown_1.14 modelr_0.1.4 magrittr_1.5 backports_1.1.4
[37] scales_1.0.0 htmltools_0.3.6 rvest_0.3.4 assertthat_0.2.1
[41] colorspace_1.4-1 stringi_1.4.3 lazyeval_0.2.2 munsell_0.5.0
[45] broom_0.5.2 crayon_1.3.4