FlowCal.excel_ui module

FlowCal’s Microsoft Excel User Interface.

This module contains functions to read, gate, and transform data from a set of FCS files, as specified by an input Microsoft Excel file. This file should contain the following tables:

  • Instruments: Describes the instruments used to acquire the samples listed in the other tables. Each instrument is specified by a row containing at least the following fields:
    • ID: Short string identifying the instrument. Will be referenced by samples in the other tables.
    • Forward Scatter Channel: Name of the forward scatter channel, as specified by the $PnN keyword in the associated FCS files.
    • Side Scatter Channel: Name of the side scatter channel, as specified by the $PnN keyword in the associated FCS files.
    • Fluorescence Channels: Name of the fluorescence channels in a comma-separated list, as specified by the $PnN keyword in the associated FCS files.
    • Time Channel: Name of the time channel, as specified by the $PnN keyword in the associated FCS files.
  • Beads: Describes the calibration beads samples that will be used to calibrate cell samples in the Samples table. The following information should be available for each beads sample:
    • ID: Short string identifying the beads sample. Will be referenced by cell samples in the Samples table.
    • Instrument ID: ID of the instrument used to acquire the sample. Must match one of the rows in the Instruments table.
    • File Path: Path of the FCS file containing the sample’s data.
    • <Fluorescence Channel Name> MEF Values: The fluorescence in MEF of each bead subpopulation, as given by the manufacturer, as a comma-separated list of numbers. Any element of this list can be replaced with the word None, in which case the corresponding subpopulation will not be used when fitting the beads fluorescence model. Note that the number of elements in this list (including the elements equal to None) is the number of subpopulations that FlowCal will try to find.
    • Gate fraction: The fraction of events to keep from the sample after density-gating in the forward/side scatter channels.
    • Clustering Channels: The fluorescence channels used to identify the different bead subpopulations.
  • Samples: Describes the biological samples to be processed. The following information should be available for each sample:
    • ID: Short string identifying the sample. Will be used as part of the plot filenames and in the Histograms table in the output Excel file.
    • Instrument ID: ID of the instrument used to acquire the sample. Must match one of the rows in the Instruments table.
    • Beads ID: ID of the beads sample used to convert data to calibrated MEF.
    • File Path: Path of the FCS file containing the sample’s data.
    • <Fluorescence Channel Name> Units: Units to which the event list in the specified fluorescence channel should be converted, and in which all subsequent plots and statistics should be reported. Should be one of the following: “Channel” (raw units), “a.u.” or “RFI” (arbitrary units), or “MEF” (calibrated Molecules of Equivalent Fluorophore). If “MEF” is specified, the Beads ID should be populated, and should correspond to a beads sample with the MEF Values specified for the same channel.
    • Gate fraction: The fraction of events to keep from the sample after density-gating in the forward/side scatter channels.

Any columns other than the ones specified above can be present, but will be ignored by FlowCal.
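As an illustration of the “MEF Values” format described above, the following minimal sketch parses such a cell with plain Python. The helper name parse_mef_values and the numeric values are hypothetical and are not part of FlowCal; the sketch only mirrors the rule that None entries still count toward the number of subpopulations.

```python
def parse_mef_values(cell):
    """Parse a comma-separated "MEF Values" cell into floats and None."""
    values = []
    for item in cell.split(","):
        item = item.strip()
        # "None" marks a subpopulation that is excluded from the model fit.
        values.append(None if item == "None" else float(item))
    return values


# Hypothetical cell contents, for illustration only.
mef_values = parse_mef_values("None, 792, 2079, 6588")

# Every element, including None, counts as a subpopulation to look for.
n_subpopulations = len(mef_values)

# Only the numeric entries would be used when fitting the beads model.
n_used_in_fit = sum(v is not None for v in mef_values)
```

Here four subpopulations would be searched for, and three of them would contribute to the fit.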

exception FlowCal.excel_ui.ExcelUIException

Bases: Exception

FlowCal Excel UI Error.

FlowCal.excel_ui.add_beads_stats(beads_table, beads_samples, mef_outputs=None)

Add stats fields to beads table.

The following information is added to each row:

  • Notes (warnings, errors) resulting from the analysis
  • Number of Events
  • Acquisition Time (s)

The following information is added for each row, for each channel in which MEF values have been specified:

  • Detector voltage (gain)
  • Amplification type
  • Bead model fitted parameters

Parameters:
beads_table : DataFrame

Table specifying bead samples to analyze. For more information about the fields required in this table, please consult the module’s documentation.

beads_samples : dict or OrderedDict

FCSData objects from which to calculate statistics. beads_samples[id] should correspond to beads_table.loc[id,:].

mef_outputs : dict or OrderedDict, optional

Intermediate results from the generation of the MEF transformation functions, as given by mef.get_transform_fxn(). This is used to populate the fields <channel> Beads Model, <channel> Beads Params. Names, and <channel> Beads Params. Values. If specified, mef_outputs[id] should correspond to beads_table.loc[id,:].

FlowCal.excel_ui.add_samples_stats(samples_table, samples)

Add stats fields to samples table.

The following information is added to each row:

  • Notes (warnings, errors) resulting from the analysis
  • Number of Events
  • Acquisition Time (s)

The following information is added for each row, for each channel in which fluorescence units have been specified:

  • Detector voltage (gain)
  • Amplification type
  • Mean
  • Geometric Mean
  • Median
  • Mode
  • Standard Deviation
  • Coefficient of Variation (CV)
  • Geometric Standard Deviation
  • Geometric Coefficient of Variation
  • Inter-Quartile Range
  • Robust Coefficient of Variation (RCV)

Parameters:
samples_table : DataFrame

Table specifying samples to analyze. For more information about the fields required in this table, please consult the module’s documentation.

samples : dict or OrderedDict

FCSData objects from which to calculate statistics. samples[id] should correspond to samples_table.loc[id,:].

Notes

Geometric statistics (geometric mean, geometric standard deviation, and geometric coefficient of variation) are defined only for positive data. If there are negative events in any relevant channel of any member of samples, geometric statistics will be calculated on the positive events only, and a warning message will be written to the “Analysis Notes” field.
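The behavior described in this note can be sketched with the standard library alone. The function geometric_mean_positive and the wording of its warning are hypothetical; this is not FlowCal's implementation, only an illustration of restricting a geometric statistic to positive events and recording a note when events were dropped.

```python
import statistics


def geometric_mean_positive(events):
    """Return (geometric mean of the positive events, warning note)."""
    positive = [x for x in events if x > 0]
    note = ""
    if len(positive) < len(events):
        # Some events were excluded, so record a note for the user.
        note = "Geometric statistics were calculated on positive events only."
    return statistics.geometric_mean(positive), note


gmean, note = geometric_mean_positive([-5.0, 1.0, 100.0])
```

In this example the negative event is ignored, the geometric mean is taken over 1.0 and 100.0, and a non-empty note is returned.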

FlowCal.excel_ui.generate_about_table(extra_info={})

Make a table with information about FlowCal and the current analysis.

Parameters:
extra_info : dict, optional

Additional keyword:value pairs to include in the table.

Returns:
about_table : DataFrame

Table with information about FlowCal and the current analysis, as keyword:value pairs. The following keywords are included: FlowCal version, and date and time of analysis. Keywords and values from extra_info are also included.

FlowCal.excel_ui.generate_histograms_table(samples_table, samples, max_bins=1024)

Generate a table of histograms as a DataFrame.

Parameters:
samples_table : DataFrame

Table specifying samples to analyze. For more information about the fields required in this table, please consult the module’s documentation.

samples : dict or OrderedDict

FCSData objects from which to calculate statistics. samples[id] should correspond to samples_table.loc[id,:].

max_bins : int, optional

Maximum number of bins to use.

Returns:
hist_table : DataFrame

A multi-indexed DataFrame. Rows contain the histogram bins and counts for every sample and channel specified in samples_table. hist_table is indexed by the sample’s ID, the channel name, and whether the row corresponds to bins or counts.

FlowCal.excel_ui.process_beads_table(beads_table, instruments_table, base_dir='.', verbose=False, plot=False, plot_dir=None, full_output=False, get_transform_fxn_kwargs={})

Process calibration bead samples, as specified by an input table.

This function processes the entries in beads_table. For each row, the function does the following:

  • Load the FCS file specified in the field “File Path”.
  • Transform the forward scatter/side scatter and fluorescence channels to RFI.
  • Remove the first 250 and the last 100 events.
  • Remove saturated events in the forward scatter and side scatter channels.
  • Apply density gating on the forward scatter/side scatter channels.
  • Generate a standard curve transformation function, for each fluorescence channel in which the associated MEF values are specified.
  • Generate forward/side scatter density plots and fluorescence histograms, and plots of the clustering and fitting steps of standard curve generation, if plot = True.

Names of forward/side scatter and fluorescence channels are taken from instruments_table.
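The event-trimming and saturation-removal steps listed above can be pictured with a small plain-Python sketch. The function trim_and_remove_saturated is hypothetical, and treating an event as saturated when any value sits at or beyond the channel limits low and high is an assumption made for illustration; FlowCal itself operates on FCSData objects.

```python
def trim_and_remove_saturated(events, low, high, num_start=250, num_end=100):
    """Drop the first/last events, then drop events at the range limits.

    events is a list of (forward scatter, side scatter) tuples.
    """
    end = max(len(events) - num_end, 0)
    trimmed = events[num_start:end]
    # Keep only events strictly inside the detector range in every channel.
    return [e for e in trimmed if all(low < v < high for v in e)]


# Tiny example with shortened trimming windows so the effect is visible.
events = [(0, 0), (1, 1), (5, 5), (1023, 7), (6, 6), (0, 4), (8, 8)]
kept = trim_and_remove_saturated(events, low=0, high=1023,
                                 num_start=2, num_end=1)
```

With these inputs, the first two and the last event are discarded, and the two remaining events that touch a range limit are removed as saturated.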

Parameters:
beads_table : DataFrame

Table specifying beads samples to be processed. For more information about the fields required in this table, please consult the module’s documentation.

instruments_table : DataFrame

Table specifying instruments. For more information about the fields required in this table, please consult the module’s documentation.

base_dir : str, optional

Directory relative to which all other paths are specified.

verbose : bool, optional

Whether to print information messages during the execution of this function.

plot : bool, optional

Whether to generate and save density/histogram plots of each sample, and each beads sample.

plot_dir : str, optional

Directory relative to base_dir into which plots are saved. If plot is False, this parameter is ignored. If plot==True and plot_dir is None, plot without saving.

full_output : bool, optional

Flag indicating whether to include an additional output, containing intermediate results from the generation of the MEF transformation functions.

get_transform_fxn_kwargs : dict, optional

Additional parameters passed directly to internal mef.get_transform_fxn() function call.

Returns:
beads_samples : OrderedDict

Processed, gated, and transformed samples, indexed by beads_table.index.

mef_transform_fxns : OrderedDict

MEF transformation functions, indexed by beads_table.index.

mef_outputs : OrderedDict, only if full_output==True

Intermediate results from the generation of the MEF transformation functions. For every entry in beads_table, FlowCal.mef.get_transform_fxn() is called on the corresponding processed and gated beads sample with full_output=True, and the full output (a MEFOutput namedtuple) is added to mef_outputs. mef_outputs is indexed by beads_table.index. Refer to the documentation for FlowCal.mef.get_transform_fxn() for more information.

FlowCal.excel_ui.process_samples_table(samples_table, instruments_table, mef_transform_fxns=None, beads_table=None, base_dir='.', verbose=False, plot=False, plot_dir=None)

Process flow cytometry samples, as specified by an input table.

The function processes each entry in samples_table, and does the following:

  • Load the FCS file specified in the field “File Path”.
  • Transform the forward scatter/side scatter to RFI.
  • Transform the fluorescence channels to the units specified in the column “<Channel name> Units”.
  • Remove the first 250 and the last 100 events.
  • Remove saturated events in the forward scatter and side scatter channels.
  • Apply density gating on the forward scatter/side scatter channels.
  • Plot combined forward/side scatter density plots and fluorescence histograms, if plot = True.

Names of forward/side scatter and fluorescence channels are taken from instruments_table.

Parameters:
samples_table : DataFrame

Table specifying samples to be processed. For more information about the fields required in this table, please consult the module’s documentation.

instruments_table : DataFrame

Table specifying instruments. For more information about the fields required in this table, please consult the module’s documentation.

mef_transform_fxns : dict or OrderedDict, optional

Dictionary containing MEF transformation functions. If any entry in samples_table requires transformation to MEF, a key: value pair must exist in mef_transform_fxns, with the key being equal to the contents of field “Beads ID”.

beads_table : DataFrame, optional

Table specifying beads samples used to generate mef_transform_fxns. This is used to check if a beads sample was taken at the same acquisition settings as a sample to be transformed to MEF. For any beads sample and channel for which a MEF transformation function has been generated, the following fields should be populated: <channel> Amp. Type and <channel> Detector Volt. If beads_table is not specified, no checking will be performed.

base_dir : str, optional

Directory relative to which all other paths are specified.

verbose : bool, optional

Whether to print information messages during the execution of this function.

plot : bool, optional

Whether to generate and save density/histogram plots of each sample, and each beads sample.

plot_dir : str, optional

Directory relative to base_dir into which plots are saved. If plot is False, this parameter is ignored. If plot==True and plot_dir is None, plot without saving.

Returns:
samples : OrderedDict

Processed, gated, and transformed samples, indexed by samples_table.index.
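The requirement on mef_transform_fxns can be checked before processing. The sketch below is a hypothetical pre-check, not a FlowCal function: it represents each row of the Samples table as a plain dict and reports the samples that request “MEF” units but whose “Beads ID” has no matching key in mef_transform_fxns.

```python
def missing_mef_transforms(samples_rows, mef_transform_fxns):
    """Return IDs of samples that need MEF but lack a transform function.

    samples_rows maps sample ID -> dict of column name -> cell value.
    """
    missing = []
    for sample_id, row in samples_rows.items():
        # A sample needs MEF if any "<channel> Units" column is "MEF".
        needs_mef = any(
            key.endswith(" Units") and value == "MEF"
            for key, value in row.items()
        )
        if needs_mef and row.get("Beads ID") not in mef_transform_fxns:
            missing.append(sample_id)
    return missing


# Hypothetical rows and a placeholder transform function.
rows = {
    "S0001": {"Beads ID": "B0001", "FL1 Units": "MEF"},
    "S0002": {"Beads ID": "B0002", "FL1 Units": "MEF"},
    "S0003": {"Beads ID": None, "FL1 Units": "RFI"},
}
fxns = {"B0001": lambda data: data}
missing = missing_mef_transforms(rows, fxns)
```

Only the second sample is reported, because it asks for MEF units and references a beads sample with no transformation function.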

FlowCal.excel_ui.read_table(filename, sheetname, index_col=None, engine=None)

Return the contents of an Excel table as a pandas DataFrame.

Parameters:
filename : str

Name of the Excel file to read.

sheetname : str or int

Name or index of the sheet inside the Excel file to read.

index_col : str, optional

Column name or index to be used as row labels of the DataFrame. If None, default index will be used.

engine : str, optional

Engine used by pd.read_excel() to read Excel file. If None, try ‘openpyxl’ then ‘xlrd’.

Returns:
table : DataFrame

A DataFrame containing the data in the specified Excel table. If index_col is not None, rows whose index_col field is empty will not be present in table.

Raises:
ValueError

If index_col is specified and two rows contain the same index_col field.
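The two index_col rules described above (rows with an empty index are dropped, and duplicate indices raise ValueError) can be illustrated without pandas. The function index_rows is a hypothetical stand-in operating on a list of dicts; it is not how read_table is implemented.

```python
def index_rows(rows, index_col):
    """Index a list of row dicts by index_col.

    Rows with an empty index are skipped; duplicates raise ValueError.
    """
    table = {}
    for row in rows:
        key = row.get(index_col)
        if key is None or key == "":
            # Rows with an empty index field are not included.
            continue
        if key in table:
            raise ValueError("duplicate value in {!r}: {!r}".format(index_col, key))
        table[key] = row
    return table


rows = [
    {"ID": "S0001", "File Path": "a.fcs"},
    {"ID": "", "File Path": "ignored.fcs"},
    {"ID": "S0002", "File Path": "b.fcs"},
]
table = index_rows(rows, "ID")
```

Appending another row with ID "S0001" to rows would make index_rows raise ValueError.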

FlowCal.excel_ui.run(input_path=None, output_path=None, verbose=True, plot=True, hist_sheet=False)

Run the MS Excel User Interface.

This function performs the following:

  1. If input_path is not specified, show a dialog to choose an input Excel file.
  2. Extract data from the Instruments, Beads, and Samples tables.
  3. Process all the bead samples specified in the Beads table.
  4. Generate statistics for each bead sample.
  5. Process all the cell samples in the Samples table.
  6. Generate statistics for each sample.
  7. If requested, generate a histogram table for each fluorescent channel specified for each sample.
  8. Generate a table with the run time, date, and FlowCal version, among other information.
  9. Save statistics and (if requested) histograms in an output Excel file.

Parameters:
input_path : str

Path to the Excel file to use as input. If None, show a dialog to select an input file.

output_path : str

Path to which to save the output Excel file. If None, use “<input_path>_output”.

verbose : bool, optional

Whether to print information messages during the execution of this function.

plot : bool, optional

Whether to generate and save density/histogram plots of each sample, and each beads sample.

hist_sheet : bool, optional

Whether to generate a sheet in the output Excel file specifying histogram bin information.
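The steps performed by run() can be approximated by calling the functions documented in this module directly. The following is a rough sketch, not the actual body of run(): the names run_pipeline and default_output_path are hypothetical, the sheet names and the "ID" index column are assumed to match the table and field names described above, and the output naming rule is one interpretation of “<input_path>_output”. Plotting and the file dialog are omitted.

```python
import os


def default_output_path(input_path):
    # Assumption: "_output" is appended to the file name, keeping the extension.
    root, ext = os.path.splitext(input_path)
    return root + "_output" + ext


def run_pipeline(input_path, output_path=None, verbose=False, hist_sheet=False):
    # Imported here so the sketch can be defined without FlowCal installed.
    import FlowCal.excel_ui as ui

    if output_path is None:
        output_path = default_output_path(input_path)
    base_dir = os.path.dirname(input_path)

    # Assumed sheet names and index column.
    instruments = ui.read_table(input_path, "Instruments", index_col="ID")
    beads = ui.read_table(input_path, "Beads", index_col="ID")
    samples_table = ui.read_table(input_path, "Samples", index_col="ID")

    # Process beads and add their statistics to the beads table.
    beads_samples, mef_fxns, mef_outputs = ui.process_beads_table(
        beads, instruments, base_dir=base_dir, verbose=verbose,
        full_output=True)
    ui.add_beads_stats(beads, beads_samples, mef_outputs)

    # Process cell samples and add their statistics to the samples table.
    samples = ui.process_samples_table(
        samples_table, instruments, mef_transform_fxns=mef_fxns,
        beads_table=beads, base_dir=base_dir, verbose=verbose)
    ui.add_samples_stats(samples_table, samples)

    tables = [("Instruments", instruments),
              ("Beads", beads),
              ("Samples", samples_table)]
    if hist_sheet:
        tables.append(("Histograms",
                       ui.generate_histograms_table(samples_table, samples)))
    # "About" is an illustrative sheet name.
    tables.append(("About", ui.generate_about_table({"Input file": input_path})))

    ui.write_workbook(output_path, tables)
    return output_path
```

For everyday use, calling run() itself is simpler; a sketch like this is mainly useful when one of the intermediate tables or FCSData dictionaries is needed for further analysis.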

FlowCal.excel_ui.run_command_line(args=None)

Entry point for the FlowCal and flowcal console scripts.

Parameters:
args: list of strings, optional

Command line arguments. If None or not specified, get arguments from sys.argv.

References

http://amir.rachum.com/blog/2017/07/28/python-entry-points/

FlowCal.excel_ui.show_open_file_dialog(filetypes)

Show an open file dialog and return the path of the file selected.

Parameters:
filetypes : list of tuples

Types of files to show in the dialog. Each tuple in the list must have two elements associated with a file type: the first element is a description, and the second is the associated extension.

Returns:
filename : str

The path of the selected file, or an empty string if no file was chosen.

FlowCal.excel_ui.write_workbook(filename, table_list, column_width=None)

Write an Excel workbook from a list of tables.

Parameters:
filename : str

Name of the Excel file to write.

table_list : list of (str, DataFrame) tuples

Tables to be saved as individual sheets in the Excel file. Each tuple contains two values: the name of the sheet to be saved as a string, and the contents of the table as a DataFrame.

column_width: int or float, optional

The column width to use when saving the spreadsheet. If None, calculate width automatically from the maximum number of characters in each column.
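The automatic width calculation mentioned for column_width=None can be pictured as follows. This is a hypothetical sketch, not FlowCal's code: auto_column_widths takes a header and rows as plain lists, and counting the header text toward the width is an assumption.

```python
def auto_column_widths(header, rows):
    """Return, per column, the maximum number of characters in any cell."""
    # Assumption: the header text also counts toward the width.
    widths = [len(str(name)) for name in header]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(str(cell)))
    return widths


header = ["ID", "File Path"]
rows = [["S0001", "FCFiles/sample001.fcs"],
        ["S0002", "FCFiles/s2.fcs"]]
widths = auto_column_widths(header, rows)
```

Each resulting width is the length of the longest entry in that column.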