Data and Analysis
=================

PIEC couples data persistence and post-processing directly into the measurement workflow. When ``run_experiment()`` is called on any measurement object, the sequence is:

1. **Capture**: raw instrument data is stored in ``self.data`` (a ``pandas.DataFrame``).
2. **Save**: ``save_waveform()`` writes metadata and data to a single CSV file.
3. **Analyze**: ``analyze()`` re-opens that CSV, appends derived columns (current, polarization, applied voltage, etc.), optionally generates plots, and overwrites the file with the enriched dataset.

Because analysis is a method on the measurement class, the exact post-processing that runs is determined by the measurement type (``HysteresisLoop``, ``ThreePulsePund``, etc.). Each subclass overrides ``analyze()`` to call the appropriate function from ``piec.analysis``.

CSV file format
---------------

Every saved file uses the same two-section layout:

.. code-block:: text

   metadata_col_1,metadata_col_2,...,mtype,timestamp,processed   ← row 0 (header)
   value_1,value_2,...,hysteresis,1714934000.0,True              ← row 1 (values)
                                                                 ← row 2 (blank)
   time (s),voltage (V),current (A),polarization (uC/cm^2),...   ← row 3 (data header)
   0.0,0.00123,...                                               ← row 4+ (data)

**Row 0–1**: a single-row metadata table whose columns capture every measurement parameter (amplitude, frequency, area, instrument IDs, ``mtype``, ``timestamp``, ``processed`` flag, etc.).

**Row 3+**: the data table. Raw columns (``time (s)``, ``voltage (V)``) are written at capture time; derived columns (``current (A)``, ``polarization (uC/cm^2)``, ``applied voltage (V)``) are appended by the analysis step, which then sets ``processed = True`` in the metadata row.

File naming
-----------

Filenames are generated by ``create_measurement_filename()`` and follow the pattern:

.. code-block:: text

   {index}_{mtype}_{notes}.csv

For example: ``0_hysteresis_1p0V_1000Hz.csv``, ``1_3pulsepund_2p0Vres_0p5Vpu.csv``.
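
As a rough illustration of this naming pattern, the sketch below builds a ``{index}_{mtype}_{notes}.csv`` path using only the standard library. ``next_measurement_filename`` is a hypothetical stand-in, not PIEC's code, and choosing one past the highest index already present is an assumption; the real ``create_measurement_filename()`` may pick the index differently.

```python
import os
import re
import tempfile


def next_measurement_filename(directory, measurement_type, notes="", ext="csv"):
    """Hypothetical stand-in for create_measurement_filename(); not PIEC's code."""
    # Create the target directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)

    # Collect the integer prefixes of files already named "{index}_...".
    used = []
    for name in os.listdir(directory):
        match = re.match(r"(\d+)_", name)
        if match:
            used.append(int(match.group(1)))

    # Assumption: the next index is one past the highest index found.
    index = max(used) + 1 if used else 0

    stem = f"{index}_{measurement_type}"
    if notes:
        stem += f"_{notes}"
    return os.path.join(directory, f"{stem}.{ext}")


# Usage: two consecutive measurements in an initially empty directory.
with tempfile.TemporaryDirectory() as tmp:
    first = next_measurement_filename(tmp, "hysteresis", "1p0V_1000Hz")
    open(first, "w").close()  # pretend the first measurement was saved
    second = next_measurement_filename(tmp, "3pulsepund", "2p0Vres_0p5Vpu")
    print(os.path.basename(first))   # 0_hysteresis_1p0V_1000Hz.csv
    print(os.path.basename(second))  # 1_3pulsepund_2p0Vres_0p5Vpu.csv
```
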
The index auto-increments to avoid overwriting existing files in the same directory.

Reading saved data
------------------

Use ``standard_csv_to_metadata_and_data()`` to reload any PIEC CSV:

.. code-block:: python

   from piec.analysis.utilities import standard_csv_to_metadata_and_data

   metadata, data = standard_csv_to_metadata_and_data('0_hysteresis_1p0V_1000Hz.csv')

   # metadata is a 1-row DataFrame with all measurement parameters
   print(metadata['amplitude'].values[0])   # 1.0
   print(metadata['mtype'].values[0])       # 'hysteresis'

   # data is the full measurement DataFrame
   data.plot(x='applied voltage (V)', y='polarization (uC/cm^2)')

Utility functions reference
---------------------------

The ``piec.analysis.utilities`` module provides the file-handling and waveform helpers used throughout the package.

``metadata_and_data_to_csv(metadata, data, path)``
   Takes two DataFrames (a 1×N metadata table and a data table) and writes them to a single CSV, separated by a blank line. Used by every measurement class in ``save_waveform()``.

``standard_csv_to_metadata_and_data(path, metadata_header_row=0, data_header_row=2)``
   Inverse of the above. Returns ``(metadata, data)`` as two separate DataFrames by reading the metadata from row 0 (1 data row) and the data from row 2 onward.

``create_measurement_filename(directory, measurement_type, notes="", type="csv")``
   Generates a unique filepath of the form ``{index}_{measurement_type}_{notes}.csv``. Auto-increments the index to avoid collisions and creates the target directory if it does not exist.

``interpolate_sparse_to_dense(x_sparse, y_sparse, total_points=100)``
   Linearly interpolates sparse (x, y) coordinate pairs into a dense y-array of length ``total_points``. Used internally to reconstruct applied-voltage waveforms from their piecewise-linear definitions during analysis.
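
To make the two-section file layout and the interpolation helper concrete, here is a minimal sketch using only ``pandas`` and ``numpy``. The names ``write_two_section_csv``, ``read_two_section_csv`` and ``densify`` are hypothetical and are not PIEC functions; the split-on-blank-line reader and the evenly spaced dense grid are assumptions about behaviour, so the actual utilities may be implemented differently.

```python
import io

import numpy as np
import pandas as pd


def write_two_section_csv(metadata, data, handle):
    """Write a metadata table and a data table, separated by one blank line."""
    metadata.to_csv(handle, index=False)
    handle.write("\n")
    data.to_csv(handle, index=False)


def read_two_section_csv(text):
    """Split at the first blank line and parse each section on its own."""
    meta_text, data_text = text.split("\n\n", 1)
    metadata = pd.read_csv(io.StringIO(meta_text))
    data = pd.read_csv(io.StringIO(data_text))
    return metadata, data


def densify(x_sparse, y_sparse, total_points=100):
    """Linearly interpolate sparse (x, y) pairs onto an evenly spaced grid.

    Assumption: the dense grid spans the first to the last sparse x value.
    """
    x_dense = np.linspace(x_sparse[0], x_sparse[-1], total_points)
    return np.interp(x_dense, x_sparse, y_sparse)


# Made-up example values, only for illustrating the round trip.
metadata = pd.DataFrame(
    [{"amplitude": 1.0, "frequency": 1000.0, "mtype": "hysteresis", "processed": False}]
)
data = pd.DataFrame(
    {"time (s)": [0.0, 1e-6, 2e-6], "voltage (V)": [0.00123, 0.00150, 0.00110]}
)

buffer = io.StringIO()
write_two_section_csv(metadata, data, buffer)
loaded_metadata, loaded_data = read_two_section_csv(buffer.getvalue())

print(loaded_metadata["mtype"].values[0])
print(list(loaded_data.columns))
print(densify([0, 1, 2], [0.0, 1.0, 0.0], total_points=5))
```

Writing to an in-memory buffer keeps the example free of filesystem side effects; passing an open file handle instead would produce the same layout on disk.
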

Example: hysteresis analysis walkthrough
----------------------------------------

The ``process_raw_hyst`` function (``piec.analysis.hysteresis``) is called automatically by ``HysteresisLoop.analyze()``. Here is what it does step by step:

1. **Load**: reads the CSV back into ``metadata`` and ``raw_df`` using ``standard_csv_to_metadata_and_data()``.

2. **Current**: converts the oscilloscope voltage to current via the 50 Ω sense resistor and subtracts the DC offset (mean of the first 20 points, which correspond to the quiet baseline prepended by the AWG):

   .. code-block:: python

      df['current (A)'] = df['voltage (V)'] / 50
      df['current (A)'] -= np.mean(df['current (A)'].values[:20])

3. **Polarization**: integrates the current over time with ``scipy.integrate.cumulative_trapezoid``, converting from C/m² to µC/cm²:

   .. code-block:: python

      df['polarization (uC/cm^2)'] = cumulative_trapezoid(
          df['current (A)'] / area * 100, df['time (s)'], initial=0)

4. **Time alignment**: if ``auto_timeshift=True``, the code assumes the first polarization maximum coincides with the first applied-voltage maximum and computes the time offset accordingly. For leaky samples this heuristic can fail, in which case a manual ``time_offset`` should be supplied.

5. **Applied voltage reconstruction**: a piecewise-linear triangle wave is generated from the metadata parameters (``amplitude``, ``frequency``, ``n_cycles``) using ``interpolate_sparse_to_dense()`` and aligned to the data using the time offset.

6. **Plots**: if requested, three figures are generated: the P–V hysteresis loop, the I–V loop, and a dual-axis time trace of polarization and applied voltage.

7. **Save**: the enriched DataFrame (now including ``current (A)``, ``polarization (uC/cm^2)``, and ``applied voltage (V)`` columns) is written back to the same CSV, and the metadata ``processed`` flag is set to ``True``.
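
The numerical core of steps 2 to 5 can be tried on synthetic data. The sketch below is not ``process_raw_hyst`` itself: it assumes an ideal linear capacitor as the sample, an electrode area given in m², a single triangle cycle, and made-up values for every parameter. The cumulative trapezoid is written out with NumPy to show what ``cumulative_trapezoid(..., initial=0)`` computes, and ``np.interp`` stands in for ``interpolate_sparse_to_dense()``.

```python
import numpy as np

# Metadata-like parameters (all values are made up for illustration).
amplitude, frequency = 1.0, 1000.0   # V, Hz; one triangle cycle is assumed
area = 1e-9                          # electrode area, assumed to be in m^2
capacitance = 1e-9                   # F, hypothetical ideal-capacitor sample
sense_resistance = 50.0              # ohm
baseline = 1e-4                      # s of quiet signal before the waveform

period = 1.0 / frequency
t = np.linspace(0.0, baseline + period, 2201)

# Piecewise-linear triangle wave defined by its corner points, starting at t = 0.
ref_t = [0.0, 0.25 * period, 0.75 * period, period]
ref_v = [0.0, amplitude, -amplitude, 0.0]

# Synthetic scope trace: I = C dV/dt through the sense resistor, plus a DC offset.
true_applied = np.interp(t - baseline, ref_t, ref_v)
scope_voltage = capacitance * np.gradient(true_applied, t) * sense_resistance + 0.002

# Step 2: voltage to current, then subtract the baseline estimated from 20 points.
current = scope_voltage / sense_resistance
current -= np.mean(current[:20])

# Step 3: cumulative trapezoidal integral, scaled from C/m^2 to uC/cm^2.
integrand = current / area * 100
increments = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)
polarization = np.concatenate(([0.0], np.cumsum(increments)))

# Step 4: align the polarization maximum with the applied-voltage maximum,
# which sits at a quarter period in the reference waveform.
time_offset = t[np.argmax(polarization)] - 0.25 * period

# Step 5: reconstruct the applied voltage on the measured time axis.
applied = np.interp(t - time_offset, ref_t, ref_v)

print("recovered time offset (s):", time_offset)
print("peak polarization (uC/cm^2):", polarization.max())
```

For a lossless capacitor the polarization tracks the applied voltage, so the maximum-matching heuristic recovers the baseline delay. With a leaky sample the integrated current keeps growing after the voltage peak, which is why the walkthrough recommends a manual ``time_offset`` in that case.
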