Last updated: 2021-07-19

Purpose

This code creates keys to relate USDA acreage data to USGS crop categories for pesticide data.

Data sources

Data sources for USDA crop acreage data are described in the data extraction code. Information on which crops were surveyed for pesticide use in the USGS dataset is from USGS metadata, Baker & Stone (2015) Appendix 1, and personal communication with Nancy Baker.

Libraries & functions

library(tidyverse)
library(data.table)
library(stringr)

Load data

crop_data <- read.csv("../output_big/nass_survey/qs.crops.ac.nat_20200404.csv")
str(crop_data)

Subset data

Select crop acreage that meets the following conditions:

  • Annual estimate
  • 1992 or later
crop_data_sub <- crop_data %>%
  filter(FREQ_DESC == "ANNUAL" &
           REFERENCE_PERIOD_DESC == "YEAR" &
           YEAR > 1991)

Extract crop names

crop_names <- crop_data_sub %>%
  group_by(SOURCE_DESC, SHORT_DESC, YEAR) %>%
  summarise(n = length(VALUE)) %>%
  spread(key = SOURCE_DESC, value = n)

Data items for major crops

The goal here is to create keys that relate USDA crop data items to USGS surveyed crops. There are two keys because acreage is reported in two slightly different ways: harvested acres only, and all acres in the ground (planted). Availability of these items varies across crops, years, and datasets (Census vs. Survey).
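
Because availability differs by source and year, it can help to tabulate the first year each data item appears in each source before choosing items for a key. The sketch below is only illustrative: `toy` is a made-up stand-in for `crop_data_sub`, its values are invented, and base R is used so that it runs without additional packages.

```r
# Toy stand-in for crop_data_sub; all values are invented for illustration
toy <- data.frame(
  SOURCE_DESC = c("SURVEY", "SURVEY", "SURVEY", "CENSUS", "CENSUS"),
  SHORT_DESC  = c("CORN - ACRES PLANTED",
                  "CORN, GRAIN - ACRES HARVESTED",
                  "CORN, GRAIN - ACRES HARVESTED",
                  "CORN, GRAIN - ACRES HARVESTED",
                  "CORN, GRAIN - ACRES HARVESTED"),
  YEAR        = c(1992, 1992, 1997, 1997, 2002),
  stringsAsFactors = FALSE
)

# Earliest year each data item appears in each source
first_year <- aggregate(YEAR ~ SOURCE_DESC + SHORT_DESC, data = toy, FUN = min)
print(first_year)
```

Swapping `min` for `max` would give the last year reported, which is a quick way to spot discontinued items.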

Corn

# look for all data items with "CORN" in the short description
corn_names <- filter(crop_names, str_detect(SHORT_DESC, "CORN"))

Notes

  • Only item with planted acreage is CORN - ACRES PLANTED
    • Only available in the survey (not census)
  • Items for harvested acreage are:
    • CORN, GRAIN - ACRES HARVESTED
    • CORN, SILAGE - ACRES HARVESTED
  • Other items are either sweet corn or subsets of these
# start list of data items for harvested acres
SHORT_DESC <- c("CORN, GRAIN - ACRES HARVESTED",
                     "CORN, SILAGE - ACRES HARVESTED")
harv_key <- as.data.frame(SHORT_DESC, stringsAsFactors = FALSE)

# start list of data items for planted acres
SHORT_DESC <- c("CORN - ACRES PLANTED")
plant_key <- as.data.frame(SHORT_DESC, stringsAsFactors = FALSE)

# add column for USGS surveyed crop name
harv_key$e_pest_name <- "Corn"
plant_key$e_pest_name <- "Corn"

# add column for USGS crop group name
harv_key$USGS_group <- "Corn"
plant_key$USGS_group <- "Corn"
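
As a rough illustration of how a key like this might be used downstream, the sketch below joins a key to acreage records on `SHORT_DESC` and sums the matched items per USGS crop and year. The data frames `key_toy` and `acres_toy` and all acreage values are hypothetical, and `VALUE` is assumed to be numeric already (the raw NASS column may need cleaning first).

```r
# Hypothetical key with the same three columns as harv_key
key_toy <- data.frame(
  SHORT_DESC  = c("CORN, GRAIN - ACRES HARVESTED",
                  "CORN, SILAGE - ACRES HARVESTED"),
  e_pest_name = "Corn",
  USGS_group  = "Corn",
  stringsAsFactors = FALSE
)

# Hypothetical acreage records; values are invented
acres_toy <- data.frame(
  SHORT_DESC = c("CORN, GRAIN - ACRES HARVESTED",
                 "CORN, SILAGE - ACRES HARVESTED",
                 "SWEET CORN - ACRES HARVESTED"),
  YEAR       = 2000,
  VALUE      = c(100, 20, 5),
  stringsAsFactors = FALSE
)

# Inner join: items absent from the key (here, sweet corn) are dropped
joined <- merge(acres_toy, key_toy, by = "SHORT_DESC")

# Sum grain and silage into one harvested total per USGS crop and year
corn_total <- aggregate(VALUE ~ e_pest_name + YEAR, data = joined, FUN = sum)
print(corn_total)
```

The same pattern would apply to the planted key; only the list of data items changes.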

Soybean

# look for all data items with "SOYBEAN" in the short description
soy_names <- filter(crop_names, str_detect(SHORT_DESC, "SOYBEAN"))

Notes

  • Only item with planted acreage is SOYBEANS - ACRES PLANTED
    • Only available in the survey (not census)
  • Harvested acreage is SOYBEANS - ACRES HARVESTED
  • Other items are subsets of these
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("SOYBEANS - ACRES HARVESTED",
                           "Soybeans",
                           "Soybeans"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c("SOYBEANS - ACRES PLANTED",
                           "Soybeans",
                           "Soybeans"))

Alfalfa

# look for all data items with "ALFALFA" in the short description
alf_names <- filter(crop_names, str_detect(SHORT_DESC, "ALFALFA"))

Notes

  • Planted acreage is not meaningful here because this is a perennial crop
  • There does not seem to be a data item reflecting total land in alfalfa, but harvested acreage should be close
  • Items to cover harvested acres:
    • HAY, ALFALFA - ACRES HARVESTED
    • HAYLAGE, ALFALFA - ACRES HARVESTED
  • Other items are subsets of these or non-alfalfa crops
  • Note that haylage did not start being measured until 2000
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("HAY, ALFALFA - ACRES HARVESTED",
                           "Alfalfa",
                           "Alfalfa"),
                  c("HAYLAGE, ALFALFA - ACRES HARVESTED",
                           "Alfalfa",
                           "Alfalfa"))

# no planted-acres item for alfalfa (perennial); NA placeholder keeps the crop listed in the key
plant_key <- rbind(plant_key,
                         c(NA,
                           "Alfalfa",
                           "Alfalfa"))

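Rows with `NA` in `SHORT_DESC` record that a crop has no usable planted-acreage item. One caution when joining: by default both base `merge()` and dplyr joins treat `NA` keys as matching each other, so a placeholder row could attach to any acreage record whose `SHORT_DESC` is missing. A minimal sketch of separating placeholders before a join, using a hypothetical `plant_key_toy`:

```r
# Hypothetical planted key with one NA placeholder row
plant_key_toy <- data.frame(
  SHORT_DESC  = c("CORN - ACRES PLANTED", NA),
  e_pest_name = c("Corn", "Alfalfa"),
  USGS_group  = c("Corn", "Alfalfa"),
  stringsAsFactors = FALSE
)

# Rows that can safely be joined to acreage data
joinable <- plant_key_toy[!is.na(plant_key_toy$SHORT_DESC), ]

# Crops flagged as having no planted-acreage item
no_planted_item <- plant_key_toy$e_pest_name[is.na(plant_key_toy$SHORT_DESC)]
print(no_planted_item)
```
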
Rice

# look for all data items with "RICE" in the short description
rice_names <- filter(crop_names, str_detect(SHORT_DESC, "RICE"))

Notes

  • Only item with planted acreage is RICE - ACRES PLANTED
    • Only available in the survey (not census)
  • Harvested acreage is RICE - ACRES HARVESTED
    • Other harvested acreage items are subsets of this
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("RICE - ACRES HARVESTED",
                           "Rice",
                           "Rice"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c("RICE - ACRES PLANTED",
                           "Rice",
                           "Rice"))

Cotton

# look for all data items with "COTTON" in the short description
cotton_names <- filter(crop_names, str_detect(SHORT_DESC, "COTTON"))

Notes

  • Only item with planted acreage is COTTON - ACRES PLANTED
    • Only available in the survey (not census)
  • Harvested acreage is COTTON - ACRES HARVESTED
  • Other items are subsets of these
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("COTTON - ACRES HARVESTED",
                           "Cotton",
                           "Cotton"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c("COTTON - ACRES PLANTED",
                           "Cotton",
                           "Cotton"))

Wheat

# look for all data items with "WHEAT" in the short description
wheat_names <- filter(crop_names, str_detect(SHORT_DESC, "WHEAT"))

Notes

  • Planted acreage is WHEAT - ACRES PLANTED
    • Only available in the survey (not census)
  • Harvested acreage is WHEAT - ACRES HARVESTED
  • Other items are subsets of these
    • It appears Baker & Stone (2015) used wheat split out into subcategories (spring, winter, durum), but I’m not sure why
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("WHEAT - ACRES HARVESTED",
                           "Wheat",
                           "Wheat"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c("WHEAT - ACRES PLANTED",
                           "Wheat",
                           "Wheat"))

Data items for minor crops

Orchards and grapes

This category includes tree crops (fruits, nuts) and grapes. According to Baker & Stone (2015), surveyed crops in this category (for all states except California) included almonds, apples, apricots, cherries, grapefruit, grapes, hazelnuts, lemons, oranges (incl. tangerines, tangelos, and temples), peaches, pears, pecans, pistachios, plums/prunes, and walnuts.

# look for all data items with orchard crops or grapes in the short description
almond_names <- filter(crop_names, str_detect(SHORT_DESC, "ALMOND"))
apple_names <- filter(crop_names, str_detect(SHORT_DESC, "APPLE"))
apricot_names <- filter(crop_names, str_detect(SHORT_DESC, "APRICOT"))
cherry_names <- filter(crop_names, str_detect(SHORT_DESC, "CHERR"))
grapefruit_names <- filter(crop_names, str_detect(SHORT_DESC, "GRAPEFRUIT"))
grape_names <- filter(crop_names, str_detect(SHORT_DESC, "GRAPE"))
hazel_names <- filter(crop_names, str_detect(SHORT_DESC, "HAZEL"))
lemon_names <- filter(crop_names, str_detect(SHORT_DESC, "LEMON"))
orange_names <- filter(crop_names, str_detect(SHORT_DESC, "ORANGE"))
peach_names <- filter(crop_names, str_detect(SHORT_DESC, "PEACH"))
pear_names <- filter(crop_names, str_detect(SHORT_DESC, "PEAR"))
pecan_names <- filter(crop_names, str_detect(SHORT_DESC, "PECAN"))
pistachio_names <- filter(crop_names, str_detect(SHORT_DESC, "PISTACH"))
plum_names <- filter(crop_names, str_detect(SHORT_DESC, "PLUM"))
tangelo_names <- filter(crop_names, str_detect(SHORT_DESC, "TANGELO"))
tangerine_names <- filter(crop_names, str_detect(SHORT_DESC, "TANGERINE"))
temple_names <- filter(crop_names, str_detect(SHORT_DESC, "TEMPLE"))
walnut_names <- filter(crop_names, str_detect(SHORT_DESC, "WALNUT"))
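
Note that a plain substring pattern such as "GRAPE" also matches grapefruit items, so `grape_names` will contain both crops. That is harmless for browsing, but a stricter pattern may be easier to read. The sketch below uses base `grepl()` on a few example strings taken from the item names in this document; the same regular expression could be passed to `str_detect()`.

```r
# Example item names (subset of those used in this document)
items <- c("GRAPES - ACRES BEARING",
           "GRAPEFRUIT - ACRES BEARING",
           "PEARS - ACRES BEARING")

# Substring match: picks up grapes and grapefruit
loose <- grepl("GRAPE", items, fixed = TRUE)

# Anchored match: only items whose description starts with "GRAPES"
strict <- grepl("^GRAPES", items)

print(items[loose])
print(items[strict])
```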

Notes

  • ACRES BEARING treated as harvested acres
    • This item starts in 2002 for most crops in Census; 2007 in Survey
  • ACRES BEARING & NON-BEARING treated as planted acres
    • This item goes back to 1997 for Census; not included in Survey
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("ALMONDS - ACRES BEARING",
                           "Almonds",
                           "Orchards_and_grapes"),
                         c("APPLES - ACRES BEARING",
                           "Apples",
                           "Orchards_and_grapes"),
                         c("APRICOTS - ACRES BEARING",
                           "Apricots",
                           "Orchards_and_grapes"),
                         c("CHERRIES, TART - ACRES BEARING",
                           "Cherries",
                           "Orchards_and_grapes"),
                         c("CHERRIES, SWEET - ACRES BEARING",
                           "Cherries",
                           "Orchards_and_grapes"),
                         c("GRAPEFRUIT - ACRES BEARING",
                           "Grapefruit",
                           "Orchards_and_grapes"),
                         c("GRAPES - ACRES BEARING",
                           "Grapes",
                           "Orchards_and_grapes"),
                         c("HAZELNUTS - ACRES BEARING",
                           "Hazelnuts",
                           "Orchards_and_grapes"),
                          c("LEMONS - ACRES BEARING",
                           "Lemons",
                           "Orchards_and_grapes"),
                         c("ORANGES - ACRES BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("PEACHES - ACRES BEARING",
                           "Peaches",
                           "Orchards_and_grapes"),
                         c("PEARS - ACRES BEARING",
                           "Pears",
                           "Orchards_and_grapes"),
                         c("PECANS - ACRES BEARING",
                           "Pecans",
                           "Orchards_and_grapes"),
                         c("PISTACHIOS - ACRES BEARING",
                           "Pistachios",
                           "Orchards_and_grapes"),
                         c("PLUMS & PRUNES - ACRES BEARING",
                           "PlumsPrunes",
                           "Orchards_and_grapes"),
                         c("TANGELOS - ACRES BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                          c("TANGERINES - ACRES BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("TEMPLES - ACRES BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("WALNUTS, ENGLISH - ACRES BEARING",
                           "Walnuts",
                           "Orchards_and_grapes"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c("ALMONDS - ACRES BEARING & NON-BEARING",
                           "Almonds",
                           "Orchards_and_grapes"),
                         c("APPLES - ACRES BEARING & NON-BEARING",
                           "Apples",
                           "Orchards_and_grapes"),
                         c("APRICOTS - ACRES BEARING & NON-BEARING",
                           "Apricots",
                           "Orchards_and_grapes"),
                         c("CHERRIES, TART - ACRES BEARING & NON-BEARING",
                           "Cherries",
                           "Orchards_and_grapes"),
                         c("CHERRIES, SWEET - ACRES BEARING & NON-BEARING",
                           "Cherries",
                           "Orchards_and_grapes"),
                         c("GRAPEFRUIT - ACRES BEARING & NON-BEARING",
                           "Grapefruit",
                           "Orchards_and_grapes"),
                         c("GRAPES - ACRES BEARING & NON-BEARING",
                           "Grapes",
                           "Orchards_and_grapes"),
                         c("HAZELNUTS - ACRES BEARING & NON-BEARING",
                           "Hazelnuts",
                           "Orchards_and_grapes"),
                         c("LEMONS - ACRES BEARING & NON-BEARING",
                           "Lemons",
                           "Orchards_and_grapes"),
                         c("ORANGES - ACRES BEARING & NON-BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("PEACHES - ACRES BEARING & NON-BEARING",
                           "Peaches",
                           "Orchards_and_grapes"),
                          c("PEARS - ACRES BEARING & NON-BEARING",
                           "Pears",
                           "Orchards_and_grapes"),
                         c("PECANS - ACRES BEARING & NON-BEARING",
                           "Pecans",
                           "Orchards_and_grapes"),
                         c("PISTACHIOS - ACRES BEARING & NON-BEARING",
                           "Pistachios",
                           "Orchards_and_grapes"),
                         c("PLUMS & PRUNES - ACRES BEARING & NON-BEARING",
                           "PlumsPrunes",
                           "Orchards_and_grapes"),
                        c("TANGELOS - ACRES BEARING & NON-BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                          c("TANGERINES - ACRES BEARING & NON-BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("TEMPLES - ACRES BEARING & NON-BEARING",
                           "Oranges",
                           "Orchards_and_grapes"),
                         c("WALNUTS, ENGLISH - ACRES BEARING & NON-BEARING",
                           "Walnuts",
                           "Orchards_and_grapes"))

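Since the data item names in the keys are typed by hand, a quick check that each one actually occurs in the acreage data may catch typos. The sketch below is an assumption about how such a check could look: in the real workflow `available_items` would come from `unique(crop_names$SHORT_DESC)` and `key_items` from `harv_key$SHORT_DESC` or `plant_key$SHORT_DESC`. The toy vectors here are invented and say nothing about which items exist in the real dataset.

```r
# Toy stand-ins; the missing item below is invented for illustration
available_items <- c("ALMONDS - ACRES BEARING",
                     "APPLES - ACRES BEARING")
key_items <- c("ALMONDS - ACRES BEARING",
               "APPLES - ACRES BEARING",
               "APPLE - ACRES BEARING",   # deliberate typo
               NA)                        # placeholder row, ignored

# Key items (excluding NA placeholders) that never appear in the data
missing_items <- setdiff(key_items[!is.na(key_items)], available_items)
print(missing_items)
```
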
Vegetables and fruit

This category includes vegetables, melons, and berries. According to Baker & Stone (2015), surveyed crops in this category (for all states except California) included artichokes, asparagus, beans (snap, bush, pole, string), broccoli, cabbage, caneberries, cantaloupes, carrots, cauliflower, celery, cucumbers, dry beans/peas, garlic, lettuce, lima beans, onions, peas, peppers, potatoes, pumpkins, spinach, squash, strawberries, sweet corn, tomatoes, and watermelons.

# look for all data items with vegetables, melons, or berries in the short description
artichoke_names <- filter(crop_names, str_detect(SHORT_DESC, "ARTICHOKE"))
asparagus_names <- filter(crop_names, str_detect(SHORT_DESC, "ASPARAGUS"))
bean_names <- filter(crop_names, str_detect(SHORT_DESC, "BEAN"))
broccoli_names <- filter(crop_names, str_detect(SHORT_DESC, "BROCCOLI"))
cabbage_names <- filter(crop_names, str_detect(SHORT_DESC, "CABBAGE"))
berry_names <- filter(crop_names, str_detect(SHORT_DESC, "BERRIE"))
melon_names <- filter(crop_names, str_detect(SHORT_DESC, "MELON"))
carrot_names <- filter(crop_names, str_detect(SHORT_DESC, "CARROT"))
cauliflower_names <- filter(crop_names, str_detect(SHORT_DESC, "CAULIFLOWER"))
celery_names <- filter(crop_names, str_detect(SHORT_DESC, "CELERY"))
cuke_names <- filter(crop_names, str_detect(SHORT_DESC, "CUCUMBER"))
dry_bean_names <- filter(crop_names, str_detect(SHORT_DESC, "DRY"))
lentil_names <- filter(crop_names, str_detect(SHORT_DESC, "LENTIL"))
garlic_names <- filter(crop_names, str_detect(SHORT_DESC, "GARLIC"))
lettuce_names <- filter(crop_names, str_detect(SHORT_DESC, "LETTUCE"))
pea_names <- filter(crop_names, str_detect(SHORT_DESC, "PEA"))
onion_names <- filter(crop_names, str_detect(SHORT_DESC, "ONION"))
pepper_names <- filter(crop_names, str_detect(SHORT_DESC, "PEPPER"))
potato_names <- filter(crop_names, str_detect(SHORT_DESC, "POTATO"))
pumpkin_names <- filter(crop_names, str_detect(SHORT_DESC, "PUMPKIN"))
spinach_names <- filter(crop_names, str_detect(SHORT_DESC, "SPINACH"))
squash_names <- filter(crop_names, str_detect(SHORT_DESC, "SQUASH"))
sweetcorn_names <- filter(crop_names, str_detect(SHORT_DESC, "SWEET CORN"))
tomato_names <- filter(crop_names, str_detect(SHORT_DESC, "TOMATO"))

Notes

  • For perennial crops (e.g. asparagus, caneberries, strawberries), ACRES PLANTED was not included because it is unclear what that item would reflect
  • For harvested acres, items were selected based on the categories used in the Census. Census and Survey categories sometimes differ; the Census is more reliably available, so I am using that.
  • Planted acres are only available in the Survey, so I used Survey categories for planted acres. I attempted to use the combination of Survey data items that most closely resemble the combination of Census data items for a given crop so that the summed acreage is as comparable as possible.
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                         c("ARTICHOKES - ACRES HARVESTED",
                           "Artichokes",
                           "Vegetables_and_fruit"),
                  c("ASPARAGUS - ACRES HARVESTED",
                           "Asparagus",
                           "Vegetables_and_fruit"),
                  c("BEANS, DRY EDIBLE, (EXCL LIMA) - ACRES HARVESTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                  c("PEAS, DRY EDIBLE - ACRES HARVESTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                  c("PEAS, DRY, SOUTHERN (COWPEAS) - ACRES HARVESTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                  c("LENTILS - ACRES HARVESTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                  c("BEANS, DRY EDIBLE, LIMA - ACRES HARVESTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                  c("BEANS, GREEN, LIMA - ACRES HARVESTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                  c("BEANS, SNAP - ACRES HARVESTED",
                           "BeansSnapBushPoleString",
                           "Vegetables_and_fruit"),
                   c("BROCCOLI - ACRES HARVESTED",
                           "Broccoli",
                           "Vegetables_and_fruit"),
                  c("CABBAGE, CHINESE - ACRES HARVESTED",
                           "Cabbage",
                           "Vegetables_and_fruit"),
                  c("CABBAGE, HEAD - ACRES HARVESTED",
                           "Cabbage",
                           "Vegetables_and_fruit"),
                  c("CABBAGE, MUSTARD - ACRES HARVESTED",
                           "Cabbage",
                           "Vegetables_and_fruit"),
                  c("BLACKBERRIES, INCL DEWBERRIES & MARIONBERRIES - ACRES HARVESTED",
                           "Caneberries",
                           "Vegetables_and_fruit"),
                  c("BOYSENBERRIES - ACRES HARVESTED",
                           "Caneberries",
                           "Vegetables_and_fruit"),
                  c("LOGANBERRIES - ACRES HARVESTED",
                           "Caneberries",
                           "Vegetables_and_fruit"),
                  c("RASPBERRIES - ACRES HARVESTED",
                           "Caneberries",
                           "Vegetables_and_fruit"),
                  c("STRAWBERRIES - ACRES HARVESTED",
                           "Strawberries",
                           "Vegetables_and_fruit"),
                  c("MELONS, CANTALOUP - ACRES HARVESTED",
                           "Cantaloupes",
                           "Vegetables_and_fruit"),
                  c("MELONS, HONEYDEW - ACRES HARVESTED",
                           "Cantaloupes",
                           "Vegetables_and_fruit"),
                  c("MELONS, WATERMELON - ACRES HARVESTED",
                           "Watermelons",
                           "Vegetables_and_fruit"),
                  c("CARROTS - ACRES HARVESTED",
                           "Carrots",
                           "Vegetables_and_fruit"),
                  c("CAULIFLOWER - ACRES HARVESTED",
                           "Cauliflower",
                           "Vegetables_and_fruit"),
                  c("CELERY - ACRES HARVESTED",
                           "Celery",
                           "Vegetables_and_fruit"),
                  c("CUCUMBERS - ACRES HARVESTED",
                           "Cucumbers",
                           "Vegetables_and_fruit"),
                  c("GARLIC - ACRES HARVESTED",
                           "Garlic",
                           "Vegetables_and_fruit"),
                  c("LETTUCE - ACRES HARVESTED",
                           "Lettuce",
                           "Vegetables_and_fruit"),
                  c("ONIONS, DRY - ACRES HARVESTED",
                           "Onions",
                           "Vegetables_and_fruit"),
                  c("ONIONS, GREEN - ACRES HARVESTED",
                           "Onions",
                           "Vegetables_and_fruit"),
                  c("PEAS, GREEN, (EXCL SOUTHERN) - ACRES HARVESTED",
                           "PeasFreshGreenSweet",
                           "Vegetables_and_fruit"),
                  c("PEAS, GREEN, SOUTHERN (COWPEAS) - ACRES HARVESTED",
                           "PeasFreshGreenSweet",
                           "Vegetables_and_fruit"),
                  c("PEAS, CHINESE (SUGAR & SNOW) - ACRES HARVESTED",
                           "PeasFreshGreenSweet",
                           "Vegetables_and_fruit"),
                  c("PEPPERS, BELL - ACRES HARVESTED",
                           "Peppers",
                           "Vegetables_and_fruit"),
                  c("PEPPERS, CHILE - ACRES HARVESTED",
                           "Peppers",
                           "Vegetables_and_fruit"),
                  c("POTATOES - ACRES HARVESTED",
                           "Potatoes",
                           "Vegetables_and_fruit"),
                  c("PUMPKINS - ACRES HARVESTED",
                           "Pumpkins",
                           "Vegetables_and_fruit"),
                  c("SPINACH - ACRES HARVESTED",
                           "Spinach",
                           "Vegetables_and_fruit"),
                  c("SQUASH - ACRES HARVESTED",
                           "Squash",
                           "Vegetables_and_fruit"),
                  c("SWEET CORN - ACRES HARVESTED",
                           "SweetCorn",
                           "Vegetables_and_fruit"),
                  c("SWEET CORN, SEED - ACRES HARVESTED",
                           "SweetCorn",
                           "Vegetables_and_fruit"),
                  c("TOMATOES, IN THE OPEN - ACRES HARVESTED",
                           "Tomatoes",
                           "Vegetables_and_fruit"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                   c("ARTICHOKES - ACRES PLANTED",
                           "Artichokes",
                           "Vegetables_and_fruit"),
                   c(NA,
                           "Asparagus",
                           "Vegetables_and_fruit"),
                   c("BEANS, DRY EDIBLE - ACRES PLANTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                   c("PEAS, DRY EDIBLE - ACRES PLANTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                   c("LENTILS - ACRES PLANTED",
                           "DryBeansPeas",
                           "Vegetables_and_fruit"),
                   c("BEANS, SNAP, FRESH MARKET - ACRES PLANTED",
                           "BeansSnapBushPoleString",
                           "Vegetables_and_fruit"),
                   c("BEANS, SNAP, PROCESSING - ACRES PLANTED",
                           "BeansSnapBushPoleString",
                           "Vegetables_and_fruit"),
                   c("BEANS, DRY EDIBLE, LIMA, BABY - ACRES PLANTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                   c("BEANS, DRY EDIBLE, LIMA, LARGE - ACRES PLANTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                   c("BEANS, GREEN, LIMA, FRESH MARKET - ACRES PLANTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                   c("BEANS, GREEN, LIMA, PROCESSING - ACRES PLANTED",
                           "LimaBeans",
                           "Vegetables_and_fruit"),
                   c("BROCCOLI - ACRES PLANTED",
                           "Broccoli",
                           "Vegetables_and_fruit"),
                   c("CABBAGE, FRESH MARKET - ACRES PLANTED",
                           "Cabbage",
                           "Vegetables_and_fruit"),
                   c("CABBAGE, PROCESSING - ACRES PLANTED",
                           "Cabbage",
                           "Vegetables_and_fruit"),
                   c(NA,
                           "Caneberries",
                           "Vegetables_and_fruit"),
                   c(NA,
                           "Strawberries",
                           "Vegetables_and_fruit"),
                   c("MELONS, CANTALOUP, FRESH MARKET - ACRES PLANTED",
                           "Cantaloupes",
                           "Vegetables_and_fruit"),
                  c("MELONS, HONEYDEW, FRESH MARKET - ACRES PLANTED",
                           "Cantaloupes",
                           "Vegetables_and_fruit"),
                  c("MELONS, WATERMELON, FRESH MARKET - ACRES PLANTED",
                           "Watermelons",
                           "Vegetables_and_fruit"),
                  c("CARROTS, FRESH MARKET - ACRES PLANTED",
                           "Carrots",
                           "Vegetables_and_fruit"),
                  c("CARROTS, PROCESSING - ACRES PLANTED",
                           "Carrots",
                           "Vegetables_and_fruit"),
                  c("CAULIFLOWER - ACRES PLANTED",
                           "Cauliflower",
                           "Vegetables_and_fruit"),
                  c("CELERY - ACRES PLANTED",
                           "Celery",
                           "Vegetables_and_fruit"),
                  c("CUCUMBERS, FRESH MARKET - ACRES PLANTED",
                           "Cucumbers",
                           "Vegetables_and_fruit"),
                  c("CUCUMBERS, PROCESSING, PICKLES - ACRES PLANTED",
                           "Cucumbers",
                           "Vegetables_and_fruit"),
                  c("GARLIC - ACRES PLANTED",
                           "Garlic",
                           "Vegetables_and_fruit"),
                  c("LETTUCE, HEAD, FRESH MARKET - ACRES PLANTED",
                           "Lettuce",
                           "Vegetables_and_fruit"),
                  c("LETTUCE, LEAF, FRESH MARKET - ACRES PLANTED",
                           "Lettuce",
                           "Vegetables_and_fruit"),
                  c("LETTUCE, ROMAINE, FRESH MARKET - ACRES PLANTED",
                           "Lettuce",
                           "Vegetables_and_fruit"),
                  c("ONIONS, DRY - ACRES PLANTED",
                           "Onions",
                           "Vegetables_and_fruit"),
                  c("ONIONS, GREEN - ACRES PLANTED",
                           "Onions",
                           "Vegetables_and_fruit"),
                  c("PEAS, GREEN, PROCESSING - ACRES PLANTED",
                           "PeasFreshGreenSweet",
                           "Vegetables_and_fruit"),
                  c("PEPPERS, BELL - ACRES PLANTED",
                           "Peppers",
                           "Vegetables_and_fruit"),
                  c("PEPPERS, CHILE - ACRES PLANTED",
                           "Peppers",
                           "Vegetables_and_fruit"),
                  c("POTATOES - ACRES PLANTED",
                           "Potatoes",
                           "Vegetables_and_fruit"),
                  c("PUMPKINS - ACRES PLANTED",
                           "Pumpkins",
                           "Vegetables_and_fruit"),
                  c("SPINACH, FRESH MARKET - ACRES PLANTED",
                           "Spinach",
                           "Vegetables_and_fruit"),
                  c("SPINACH, PROCESSING - ACRES PLANTED",
                           "Spinach",
                           "Vegetables_and_fruit"),
                  c("SQUASH - ACRES PLANTED",
                           "Squash",
                           "Vegetables_and_fruit"),
                  c("SWEET CORN, FRESH MARKET - ACRES PLANTED",
                           "SweetCorn",
                           "Vegetables_and_fruit"),
                  c("SWEET CORN, PROCESSING - ACRES PLANTED",
                           "SweetCorn",
                           "Vegetables_and_fruit"),
                  c("TOMATOES, IN THE OPEN, PROCESSING - ACRES PLANTED",
                           "Tomatoes",
                           "Vegetables_and_fruit"),
                  c("TOMATOES, IN THE OPEN, FRESH MARKET - ACRES PLANTED",
                           "Tomatoes",
                           "Vegetables_and_fruit"))

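With keys this long, a data item entered twice would be double counted once the key is joined to acreage data and summed. A minimal sketch of a duplicate check and a per-crop item count, using a hypothetical `key_toy` rather than the real `plant_key`:

```r
# Hypothetical key with one item accidentally entered twice
key_toy <- data.frame(
  SHORT_DESC  = c("CARROTS, FRESH MARKET - ACRES PLANTED",
                  "CARROTS, PROCESSING - ACRES PLANTED",
                  "CARROTS, PROCESSING - ACRES PLANTED",
                  NA),
  e_pest_name = c("Carrots", "Carrots", "Carrots", "Asparagus"),
  stringsAsFactors = FALSE
)

# Repeated data items, ignoring NA placeholders
is_dup <- duplicated(key_toy$SHORT_DESC) & !is.na(key_toy$SHORT_DESC)
dup_items <- key_toy$SHORT_DESC[is_dup]
print(dup_items)

# Number of distinct data items mapped to each USGS crop
items_per_crop <- tapply(key_toy$SHORT_DESC, key_toy$e_pest_name,
                         function(x) length(unique(x[!is.na(x)])))
print(items_per_crop)
```
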
Other crops

This category includes assorted crops that do not fit into other categories. According to Baker & Stone (2015), surveyed crops in this category (for all states except California) included barley, canola (oilseed rape), peanuts, sorghum, sugar beets, sugarcane, sunflowers, and tobacco.

# look for all data items with other crops in the short description
barley_names <- filter(crop_names, str_detect(SHORT_DESC, "BARLEY"))
canola_names <- filter(crop_names, str_detect(SHORT_DESC, "CANOLA"))
rapeseed_names <- filter(crop_names, str_detect(SHORT_DESC, "RAPESEED"))
peanut_names <- filter(crop_names, str_detect(SHORT_DESC, "PEANUT"))
sorghum_names <- filter(crop_names, str_detect(SHORT_DESC, "SORGHUM"))
sugar_beet_names <- filter(crop_names, str_detect(SHORT_DESC, "BEET"))
sugarcane_names <- filter(crop_names, str_detect(SHORT_DESC, "CANE"))
sunflower_names <- filter(crop_names, str_detect(SHORT_DESC, "SUNFLOWER"))
tobacco_names <- filter(crop_names, str_detect(SHORT_DESC, "TOBACCO"))

Notes

  • For sugarbeets and sugarcane, included crops harvested for sugar or seed
  • Sugarcane is a perennial, so it is not included in planted acres
  • There is no planted-acres data item for tobacco (reason unknown)
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                  c("BARLEY - ACRES HARVESTED",
                    "Barley",
                    "Other_crops"),
                  c("CANOLA - ACRES HARVESTED",
                    "CanolaOilseedRape",
                    "Other_crops"),
                  c("RAPESEED - ACRES HARVESTED",
                    "CanolaOilseedRape",
                    "Other_crops"),
                  c("PEANUTS - ACRES HARVESTED",
                    "Peanuts",
                    "Other_crops"),
                  c("SORGHUM, GRAIN - ACRES HARVESTED",
                    "SorghumMilo",
                    "Other_crops"),
                  c("SORGHUM, SILAGE - ACRES HARVESTED",
                    "SorghumMilo",
                    "Other_crops"),
                  c("SORGHUM, SYRUP - ACRES HARVESTED",
                    "SorghumMilo",
                    "Other_crops"),
                  c("SUGARBEETS - ACRES HARVESTED",
                    "SugarBeets",
                    "Other_crops"),
                  c("SUGARBEETS, SEED - ACRES HARVESTED",
                    "SugarBeets",
                    "Other_crops"),
                  c("SUGARCANE, SUGAR - ACRES HARVESTED",
                    "Sugarcane",
                    "Other_crops"),
                  c("SUGARCANE, SEED - ACRES HARVESTED",
                    "Sugarcane",
                    "Other_crops"),
                  c("SUNFLOWER - ACRES HARVESTED",
                    "Sunflowers",
                    "Other_crops"),
                  c("TOBACCO - ACRES HARVESTED",
                    "Tobacco",
                    "Other_crops"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                   c("BARLEY - ACRES PLANTED",
                     "Barley",
                     "Other_crops"),
                   c("CANOLA - ACRES PLANTED",
                     "CanolaOilseedRape",
                     "Other_crops"),
                   c("RAPESEED - ACRES PLANTED",
                     "CanolaOilseedRape",
                     "Other_crops"),
                   c("PEANUTS - ACRES PLANTED",
                     "Peanuts",
                     "Other_crops"),
                   c("SORGHUM - ACRES PLANTED",
                     "SorghumMilo",
                     "Other_crops"),
                   c("SUGARBEETS - ACRES PLANTED",
                     "SugarBeets",
                     "Other_crops"),
                   # sugarcane is a perennial, so no planted-acreage item is included
                   c(NA,
                     "Sugarcane",
                     "Other_crops"),
                   c("SUNFLOWER - ACRES PLANTED",
                     "Sunflowers",
                     "Other_crops"),
                   # no planted-acreage item is reported for tobacco
                   c(NA,
                     "Tobacco",
                     "Other_crops"))

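Rows with NA in place of a data item act as placeholders: they keep the crop in the planted key while recording that no planted-acreage item exists. The sketch below shows one way such rows might be listed or dropped before the key is joined to acreage data. The toy key and the column name `USGS_crop` are assumptions for illustration; only `SHORT_DESC` and `USGS_group` appear by name in this code.

```r
library(dplyr)

# Toy key with the same three-column layout assumed for plant_key
toy_key <- data.frame(
  SHORT_DESC = c("BARLEY - ACRES PLANTED", NA, NA),
  USGS_crop  = c("Barley", "Sugarcane", "Tobacco"),  # hypothetical column name
  USGS_group = "Other_crops",
  stringsAsFactors = FALSE
)

# Crops that have no planted-acreage item
no_planted <- toy_key %>%
  filter(is.na(SHORT_DESC)) %>%
  pull(USGS_crop)

# Rows that can actually be joined to acreage data
usable <- toy_key %>%
  filter(!is.na(SHORT_DESC))
```
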
Pasture and hay

This category includes pasture and non-alfalfa hay crops. According to Baker & Stone (2015), surveyed crops in this category included pastureland and fallow land. In personal communication, Nancy Baker also suggested including non-alfalfa hay. Pastureland and fallow land are recorded in a separate Census dataset (the 'economic' items), so this section focuses on the items in the crop dataset (non-alfalfa hay).

# look for all data items with hay in the short description
hay_names <- filter(crop_names, str_detect(SHORT_DESC, "HAY"))

Notes

  • Harvested acres for non-alfalfa hay appear to be split into several Census categories
  • Planted/total acres are not compiled because these are often perennial crops
  • The Survey has an item called HAY & HAYLAGE, (EXCL ALFALFA), but it is not reported in the Census
# add to list of data items for harvested acres
harv_key <- rbind(harv_key,
                  c("HAY, SMALL GRAIN - ACRES HARVESTED",
                    "NonAlfalfaHay",
                    "Pasture_and_hay"),
                  c("HAY, TAME, (EXCL ALFALFA & SMALL GRAIN) - ACRES HARVESTED",
                    "NonAlfalfaHay",
                    "Pasture_and_hay"),
                  c("HAY, WILD - ACRES HARVESTED",
                    "NonAlfalfaHay",
                    "Pasture_and_hay"),
                  c("HAYLAGE, (EXCL ALFALFA) - ACRES HARVESTED",
                    "NonAlfalfaHay",
                    "Pasture_and_hay"))

# add to list of data items for planted acres
plant_key <- rbind(plant_key,
                         c(NA,
                           "NonAlfalfaHay",
                           "Pasture_and_hay"))

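Because several data items map to the same USGS crop, acreage for that crop is the sum across its items. The sketch below shows how a key of this shape might be used downstream. The acreage values are made up, `USGS_crop` is a hypothetical column name, and `VALUE` is assumed to be numeric already.

```r
library(dplyr)

# Two of the hay items from the harvested key, mapped to one USGS crop
toy_key <- data.frame(
  SHORT_DESC = c("HAY, SMALL GRAIN - ACRES HARVESTED",
                 "HAY, WILD - ACRES HARVESTED"),
  USGS_crop  = "NonAlfalfaHay",  # hypothetical column name
  stringsAsFactors = FALSE
)

# Made-up acreage values for a single place and year
toy_acres <- data.frame(
  SHORT_DESC = c("HAY, SMALL GRAIN - ACRES HARVESTED",
                 "HAY, WILD - ACRES HARVESTED",
                 "CORN, GRAIN - ACRES HARVESTED"),
  VALUE = c(100, 250, 900),
  stringsAsFactors = FALSE
)

# Keep only items in the key, then sum acreage within each USGS crop
hay_total <- toy_acres %>%
  inner_join(toy_key, by = "SHORT_DESC") %>%
  group_by(USGS_crop) %>%
  summarise(acres = sum(VALUE))
```
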
Summary

  • USGS and USDA crop names were matched based on information in Baker & Stone (2015), Appendix 1, Table 1 and conversations with Nancy Baker, as well as USDA documentation
  • Harvested acreage:
    • Available every 5 years in the Census for all crops
    • Available more frequently from the Survey for major crops (corn, soy, wheat, cotton, rice, alfalfa)
  • Planted/total acreage:
    • Not attempted for perennial crops, except orchards (no data or too complicated)
    • For annual crops, planted acreage is only reported in the Survey
    • For orchard crops, total acreage is considered bearing + non-bearing acreage, which is reported in the Census. Harvested acreage (bearing) is only consistently available from 2002 onward.
  • In some cases Census and Survey data do not match exactly because they use different crop categories; items were matched as closely as possible
  • The ‘summary’ key follows the methods in Baker & Stone (2015) as closely as possible while maximizing data coverage
    • Harvested acreage is used, with priority given to Census data
    • For orchard crops, bearing + non-bearing acreage is used to maximize coverage
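
One way the "priority given to Census data" rule might be applied when acreage is later assembled is sketched below. It is not necessarily the method used downstream. The values are made up, and the `SOURCE_DESC` labels "CENSUS" and "SURVEY" are assumed.

```r
library(dplyr)

# Made-up acreage for one crop: both sources in 2012, Survey only in 2013
toy <- data.frame(
  YEAR = c(2012, 2012, 2013),
  SOURCE_DESC = c("CENSUS", "SURVEY", "SURVEY"),
  acres = c(1000, 980, 1010),
  stringsAsFactors = FALSE
)

# Within each year, keep Census rows if any exist; otherwise keep the Survey rows
preferred <- toy %>%
  group_by(YEAR) %>%
  filter(SOURCE_DESC == "CENSUS" | !any(SOURCE_DESC == "CENSUS")) %>%
  ungroup()
```

In this toy example the 2012 Census value is kept over the Survey value, and the 2013 Survey value is kept because no Census estimate exists for that year.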

Create summary key

summary_key <- harv_key %>%
  filter(USGS_group!="Orchards_and_grapes") %>%
  rbind(filter(plant_key, USGS_group=="Orchards_and_grapes"))

# Add remaining data items from the 'economic' part of the Census for pastureland category
summary_key <- rbind(summary_key,
                         c("AG LAND, PASTURELAND - ACRES",
                           "Pasture",
                           "Pasture_and_hay"),
                         c("AG LAND, CROPLAND, (EXCL HARVESTED & PASTURED), CULTIVATED SUMMER FALLOW - ACRES",
                           "Fallow",
                           "Pasture_and_hay"))

Check keys

# make sure all data items were entered correctly by joining back to the original data item names
# rows where test is NA did not match; this is expected only for NA placeholders
# and for items from the 'economic' dataset (not crops)
crop_names_short <- as.data.frame(unique(crop_names$SHORT_DESC))
crop_names_short$test <- "test"
names(crop_names_short) <- c("SHORT_DESC", "test")
harv_check <- left_join(harv_key, crop_names_short, by="SHORT_DESC")
plant_check <- left_join(plant_key, crop_names_short, by="SHORT_DESC")
summary_check <- left_join(summary_key, crop_names_short, by="SHORT_DESC")
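
The joins above flag matches in the `test` column, but the unmatched rows still have to be inspected by eye. The sketch below shows one way to list only the entries that failed to match, using `anti_join()`. The toy data frames and the misspelled item are made up for illustration.

```r
library(dplyr)

# Toy stand-in for the unique data item names in the source data
toy_names <- data.frame(
  SHORT_DESC = c("BARLEY - ACRES HARVESTED", "TOBACCO - ACRES HARVESTED"),
  stringsAsFactors = FALSE
)

# Toy key containing one deliberate typo and one NA placeholder
toy_key <- data.frame(
  SHORT_DESC = c("BARLEY - ACRES HARVESTED", "TOBACO - ACRES HARVESTED", NA),
  USGS_group = "Other_crops",
  stringsAsFactors = FALSE
)

# Drop NA placeholders, then keep key rows with no match in the source names
unmatched <- toy_key %>%
  filter(!is.na(SHORT_DESC)) %>%
  anti_join(toy_names, by = "SHORT_DESC")
```

An empty result would indicate that every non-placeholder item in the key exists in the source data.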

Export keys

write.csv(harv_key, "../keys/crop_key_harv.csv", row.names=FALSE)
write.csv(plant_key, "../keys/crop_key_plant.csv", row.names=FALSE)
write.csv(summary_key, "../keys/crop_key_summary.csv", row.names=FALSE)

Session information

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] data.table_1.14.0 forcats_0.4.0     stringr_1.4.0    
 [4] dplyr_0.8.3       purrr_0.3.2       readr_1.3.1      
 [7] tidyr_1.1.0       tibble_2.1.3      ggplot2_3.2.0    
[10] tidyverse_1.2.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6       cellranger_1.1.0 pillar_1.4.2     compiler_3.6.1  
 [5] tools_3.6.1      digest_0.6.20    lubridate_1.7.4  jsonlite_1.6    
 [9] evaluate_0.14    lifecycle_0.2.0  nlme_3.1-140     gtable_0.3.0    
[13] lattice_0.20-38  pkgconfig_2.0.2  rlang_0.4.7      cli_1.1.0       
[17] rstudioapi_0.10  yaml_2.2.0       haven_2.1.1      xfun_0.8        
[21] withr_2.1.2      xml2_1.2.0       httr_1.4.2       knitr_1.23      
[25] hms_0.5.0        generics_0.0.2   vctrs_0.3.2      grid_3.6.1      
[29] tidyselect_1.1.0 glue_1.3.1       R6_2.4.0         readxl_1.3.1    
[33] rmarkdown_1.14   modelr_0.1.4     magrittr_1.5     backports_1.1.4 
[37] scales_1.0.0     htmltools_0.3.6  rvest_0.3.4      assertthat_0.2.1
[41] colorspace_1.4-1 stringi_1.4.3    lazyeval_0.2.2   munsell_0.5.0   
[45] broom_0.5.2      crayon_1.3.4    
