Data Quality

data_summary(data[, col, ...])

Output the general information of the dataset.

sample_duration(data[, col])

Calculate the data sampling interval.

transbigdata.data_summary(data, col=['Vehicleid', 'Time'], show_sample_duration=False, roundnum=4)

Output the general information of the dataset.

Parameters:
  • data (DataFrame) – The trajectory points data

  • col (List) – The column names, in the order of ['Vehicleid', 'Time']

  • show_sample_duration (bool) – Whether to also output the individual sampling intervals

  • roundnum (number) – Number of decimal places to keep in the output
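
The exact output of data_summary is not listed here. As a rough illustration only, the sketch below uses plain pandas to compute the kind of figures a dataset summary typically covers (record count, number of individuals, mean records per individual, time span). The toy data, the `summary` dictionary and its keys are assumptions made for this example, not the library's implementation or output format.

```python
import pandas as pd

# Toy trajectory points; column names follow the documented defaults.
data = pd.DataFrame({
    "Vehicleid": [1, 1, 1, 2, 2],
    "Time": pd.to_datetime([
        "2024-01-01 08:00:00",
        "2024-01-01 08:00:15",
        "2024-01-01 08:00:45",
        "2024-01-01 09:00:00",
        "2024-01-01 09:00:20",
    ]),
})

# Unpack the column names in the same order as the `col` parameter.
vehicle_col, time_col = ["Vehicleid", "Time"]
roundnum = 4  # decimal places, mirroring the documented parameter

# Number of records per individual (vehicle).
points_per_vehicle = data[vehicle_col].value_counts()

# Hypothetical summary structure, for illustration only.
summary = {
    "total_points": len(data),
    "total_individuals": len(points_per_vehicle),
    "mean_points_per_individual": round(float(points_per_vehicle.mean()), roundnum),
    "start_time": data[time_col].min(),
    "end_time": data[time_col].max(),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

With the library installed, the documented call for the same data would be transbigdata.data_summary(data, col=['Vehicleid', 'Time'], show_sample_duration=False, roundnum=4).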

transbigdata.sample_duration(data, col=['Vehicleid', 'Time'])

Calculate the data sampling interval.

Parameters:
  • data (DataFrame) – The trajectory points data

  • col (List) – The column names, in the order of ['Vehicleid', 'Time']

Returns:

sample_duration – A DataFrame with a single column named duration, containing the sampling interval of the data, in seconds

Return type:

DataFrame
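
To make "sampling interval" concrete, the following is a minimal pandas sketch that computes the time gap, in seconds, between consecutive points of the same vehicle. The helper name `sampling_intervals` and the toy data are hypothetical. This is one plausible way to obtain such a result, not necessarily how transbigdata.sample_duration is implemented, and details such as which row each interval is attached to may differ.

```python
import pandas as pd


def sampling_intervals(data, col=("Vehicleid", "Time")):
    """Hypothetical helper: seconds between consecutive points per vehicle."""
    vehicle_col, time_col = col
    df = data[[vehicle_col, time_col]].copy()
    # Parse timestamps so that differences are proper timedeltas.
    df[time_col] = pd.to_datetime(df[time_col])
    # Order points by vehicle, then chronologically.
    df = df.sort_values([vehicle_col, time_col])
    # Difference to the previous point of the same vehicle, in seconds.
    duration = df.groupby(vehicle_col)[time_col].diff().dt.total_seconds()
    # The first point of each vehicle has no predecessor, so drop it.
    return duration.dropna().to_frame("duration")


# Toy data, deliberately out of chronological order.
data = pd.DataFrame({
    "Vehicleid": ["a", "b", "a", "a", "b"],
    "Time": [
        "2024-01-01 08:00:30",
        "2024-01-01 08:00:00",
        "2024-01-01 08:00:00",
        "2024-01-01 08:00:10",
        "2024-01-01 08:01:00",
    ],
})

result = sampling_intervals(data)
print(result)
```

Intervals are computed only within a vehicle, so the gap between the last point of one vehicle and the first point of the next never enters the result.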