pyspark.sql.streaming.DataStreamReader.parquet

DataStreamReader.parquet(path: str, mergeSchema: Optional[bool] = None, pathGlobFilter: Union[bool, str, None] = None, recursiveFileLookup: Union[bool, str, None] = None, datetimeRebaseMode: Union[bool, str, None] = None, int96RebaseMode: Union[bool, str, None] = None) → DataFrame

Loads a Parquet file stream, returning the result as a DataFrame.

New in version 2.0.0.

Parameters
path : str

The path in any Hadoop-supported file system.

Other Parameters
Extra options

For the extra options, refer to Data Source Option for the version you use.
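
The keyword arguments in the signature above can be passed directly to `parquet()`. The following is a minimal sketch, not part of the PySpark API: `parquet_stream_options` and `read_parquet_stream` are hypothetical helpers introduced here for illustration, and the schema is given as a DDL string on the assumption that `DataStreamReader.schema` accepts one in the version you use.

```python
from typing import Any, Dict, Optional


def parquet_stream_options(
    merge_schema: Optional[bool] = None,
    path_glob_filter: Optional[str] = None,
    recursive_file_lookup: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: keep only the options that were explicitly set."""
    candidates = {
        "mergeSchema": merge_schema,
        "pathGlobFilter": path_glob_filter,
        "recursiveFileLookup": recursive_file_lookup,
    }
    # Options left as None are dropped so Spark falls back to its defaults.
    return {name: value for name, value in candidates.items() if value is not None}


def read_parquet_stream(spark, path: str, schema: str, **options):
    """Hypothetical helper: open a streaming Parquet source on an existing SparkSession.

    `schema` is assumed to be a DDL string such as "data STRING".
    """
    return spark.readStream.schema(schema).parquet(path, **options)


options = parquet_stream_options(merge_schema=True, path_glob_filter="*.parquet")
print(options)
```

With an active session, a call might look like `read_parquet_stream(spark, input_dir, "data STRING", **options)`, where `input_dir` is a placeholder for your own directory of Parquet files.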

Examples

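>>> import tempfile
>>> from pyspark.sql.types import StructType, StructField, StringType
>>> # Illustrative schema; streaming file sources need a schema up front by default.
>>> sdf_schema = StructType([StructField("data", StringType(), True)])
>>> parquet_sdf = spark.readStream.schema(sdf_schema).parquet(tempfile.mkdtemp())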
>>> parquet_sdf.isStreaming
True
>>> parquet_sdf.schema == sdf_schema
True