If no argument is provided, PigStorage assumes tab-delimited format. If a delimiter argument is provided, it must be a single-byte character; any literal (e.g. 'a', '|') or known escape character (e.g. '\t', '\r') is a valid delimiter. For example, to load a space-separated file:
data = LOAD 's3n://input-bucket/input-folder' USING PigStorage(' ') AS (field0:chararray, field1:int);
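The schema is provided in the AS clause. If the clause is omitted, the fields are unnamed, default to type bytearray, and can only be referenced by position ($0, $1, ...).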
To store data using PigStorage, the same delimiter rules apply:
STORE data INTO 's3n://output-bucket/output-folder' USING PigStorage('\t');
PigStorage is an extremely simple loader: it does not handle special cases such as embedded delimiters, quoted fields, or escaped control characters, and it splits on every instance of the delimiter regardless of context. For this reason, CSVExcelStorage is recommended over PigStorage with a comma delimiter when loading a CSV file.
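As a minimal sketch of that recommendation, the earlier load could be written for a comma-separated file roughly as follows. It assumes the piggybank jar, which contains CSVExcelStorage, is available to the script; the jar path shown is a placeholder that depends on the installation, and the bucket and field names are reused from the example above.

```pig
-- Assumption: the location of piggybank.jar varies by installation; adjust the path.
REGISTER piggybank.jar;

-- CSVExcelStorage honors quoted fields, so a comma inside quotes does not split the field.
data = LOAD 's3n://input-bucket/input-folder'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
    AS (field0:chararray, field1:int);
```

With PigStorage(','), a record such as "Smith, John",42 would be split into three pieces at both commas; with the loader above it is read as two fields.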