Package org.biojava.nbio.survival.data
Class WorkSheet
java.lang.Object
org.biojava.nbio.survival.data.WorkSheet
Designed to handle very large spreadsheets of expression data, so the memory
footprint is kept low.
- Author:
- Scooter Willis
-
Constructor Summary
Constructors
Constructor / Description
WorkSheet(Collection<String> rows, Collection<String> columns)
WorkSheet(CompactCharSequence[][] values)
-
Method Summary
Modifier and Type / Method / Description
void addCell - Add data to a cell
void addColumn
void addColumns(ArrayList<String> columns, String defaultValue) - Add columns to worksheet and set default value
void addRow
void addRows - Add rows to the worksheet and fill in default value
void appendWorkSheetColumns(WorkSheet worksheet) - Add columns from a second worksheet to be joined by common row.
void appendWorkSheetRows(WorkSheet worksheet) - Add rows from a second worksheet to be joined by common column.
void applyColumnFilter(String column, ChangeValue changeValue) - Apply filter to a column to change values from, say, numeric to nominal based on some range
void changeColumnHeader(String col, String newCol)
void changeColumnHeader(ChangeValue changeValue)
void changeColumnsHeaders(LinkedHashMap<String, String> newColumnValues) - Change the columns in the HashMap key to the name of the value
void changeRowHeader(String row, String newRow)
void changeRowHeader(ChangeValue changeValue)
void clear() - See if we can free up memory
getAllColumns - Get the list of column names including those that may be hidden
getAllRows - Get all rows including those that may be hidden
getCell - Get cell value
getCellDouble(String row, String col)
getColumnIndex(String column)
getColumns - Get the list of column names.
static WorkSheet getCopyWorkSheet(WorkSheet copyWorkSheet) - Create a copy of a worksheet.
static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, ArrayList<String> rows) - Create a copy of a worksheet.
getDataRows - Get the list of row names
getDiscreteColumnValues(String column) - Get back a list of unique values in the column
getDiscreteRowValues - Get back a list of unique values in the row
getLogScale(double base) - Get the log scale of this worksheet; a zero value is set to 0.1 because log(0) is undefined
getLogScale(double base, double zeroValue) - Get the log scale of this worksheet
getRandomDataColumns(int number)
getRandomDataColumns(int number, ArrayList<String> columns)
getRowIndex(String row)
getRows() - Get the list of row names.
void hideColumn(String column, boolean hide)
void hideEmptyColumns
void hideEmptyRows
void hideMetaDataColumns(boolean value)
void hideMetaDataRows(boolean value)
void hideRow
boolean isMetaDataColumn(String column)
boolean isMetaDataRow(String row)
boolean isValidColumn(String col)
boolean isValidRow(String row)
void markMetaDataColumn(String column)
void markMetaDataColumns(ArrayList<String> metaDataColumns) - Marks columns as containing meta data
void markMetaDataRow(String row)
void randomlyDivideSave(double percentage, String fileName1, String fileName2) - Split a worksheet randomly.
static WorkSheet readCSV
static WorkSheet readCSV(InputStream is, char delimiter) - Read a CSV/tab-delimited file where you pass in the delimiter
static WorkSheet readCSV - Read a CSV/tab-delimited file where you pass in the delimiter
void replaceColumnValues(String column, HashMap<String, String> values) - Change values in a column where 0 = something and 1 = something different
void save(OutputStream outputStream, char delimitter, boolean quoteit)
void saveCSV - Save the worksheet as a csv file
void saveTXT
void setCacheDoubleValues(boolean value)
void setIndexColumnName(String indexColumnName)
void setMetaDataColumns(ArrayList<String> metaDataColumns) - Clears existing meta data columns and sets new ones
void setMetaDataColumnsAfterColumn
void setMetaDataColumnsAfterColumn(String column)
void setMetaDataRows(ArrayList<String> metaDataRows)
void setMetaDataRowsAfterRow
void setMetaDataRowsAfterRow
void setRowHeader(String value)
void shuffleColumnsAndThenRows(ArrayList<String> columns, ArrayList<String> rows) - Randomly shuffle the columns and rows.
void shuffleColumnValues(ArrayList<String> columns) - Shuffle column values to allow for randomized testing.
void shuffleRowValues(ArrayList<String> rows) - Shuffle row values to allow for randomized testing.
swapRowAndColumns - Swap the rows and columns, returning a new worksheet
toString()
static WorkSheet unionWorkSheetsRowJoin(String w1FileName, String w2FileName, char delimitter, boolean secondSheetMetaData) - Combine two worksheets where you join based on rows.
static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) - Combine two worksheets where you join based on rows.
-
Constructor Details
-
Method Details
-
clear
See if we can free up memory.
-
toString
-
randomlyDivideSave
public void randomlyDivideSave(double percentage, String fileName1, String fileName2) throws Exception
Split a worksheet randomly. Used for creating a discovery/validation data set. The first file will match the percentage and the second file will hold the remainder.
- Parameters:
percentage -
fileName1 -
fileName2 -
- Throws:
Exception
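The split described above can be illustrated with a minimal, standalone sketch. It is not the WorkSheet implementation: the class `RandomSplitSketch`, the `seed` argument, and the reading of `percentage` as a fraction between 0 and 1 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of WorkSheet.
class RandomSplitSketch {
    /** Splits row names into two lists; the first receives about `percentage` of them. */
    static List<List<String>> split(List<String> rows, double percentage, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed)); // seeded so the split is repeatable
        // Assumption: percentage is a fraction in [0, 1].
        int cut = (int) Math.round(shuffled.size() * percentage);
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));               // e.g. discovery set
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size()))); // e.g. validation set
        return parts;
    }
}
```

Each row lands in exactly one of the two lists, which is the property a discovery/validation split needs.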
-
getCopyWorkSheetSelectedRows
public static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, ArrayList<String> rows) throws Exception
Create a copy of a worksheet. When shuffling columns or rows for testing, this is a way to duplicate the original worksheet.
- Parameters:
copyWorkSheet -
rows -
- Returns:
- Throws:
Exception
-
getCopyWorkSheet
Create a copy of a worksheet. When shuffling columns or rows for testing, this is a way to duplicate the original worksheet.
- Parameters:
copyWorkSheet -
- Returns:
- Throws:
Exception
-
getMetaDataColumns
- Returns:
-
getMetaDataRows
- Returns:
-
getDataColumns
- Returns:
-
shuffleColumnsAndThenRows
public void shuffleColumnsAndThenRows(ArrayList<String> columns, ArrayList<String> rows) throws Exception
Randomly shuffle the columns and rows. The shuffled values should be constrained to the same data type; otherwise the result probably doesn't make sense.
- Parameters:
columns -
rows -
- Throws:
Exception
-
shuffleColumnValues
Shuffle column values to allow for randomized testing. The columns in the list will be shuffled together.
- Parameters:
columns -
- Throws:
Exception
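As a rough illustration of shuffling column values for a randomization test, the sketch below permutes the values of each listed column across rows and leaves other columns alone. This is only one reading of "shuffled together"; the class `ColumnShuffleSketch`, the `String[][]` grid, and the use of column indexes instead of names are assumptions, not the library's algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of WorkSheet.
class ColumnShuffleSketch {
    /** Permutes the values of each listed column in place; other columns are untouched. */
    static void shuffleColumns(String[][] cells, int[] columns, Random rnd) {
        for (int col : columns) {
            List<String> values = new ArrayList<>();
            for (String[] row : cells) {
                values.add(row[col]);          // collect the column top to bottom
            }
            Collections.shuffle(values, rnd);  // random permutation of the column
            for (int r = 0; r < cells.length; r++) {
                cells[r][col] = values.get(r); // write the permuted values back
            }
        }
    }
}
```

The set of values in a shuffled column is unchanged; only their assignment to rows is randomized.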
-
shuffleRowValues
Shuffle row values to allow for randomized testing. The rows in the list will be shuffled together.
- Parameters:
rows -
- Throws:
Exception
-
hideMetaDataColumns
- Parameters:
value-
-
hideMetaDataRows
- Parameters:
value-
-
setMetaDataRowsAfterRow
-
setMetaDataColumnsAfterColumn
-
setMetaDataRowsAfterRow
- Parameters:
row-
-
setMetaDataColumnsAfterColumn
- Parameters:
column-
-
setMetaDataColumns
Clears existing meta data columns and sets new ones.
- Parameters:
metaDataColumns -
-
markMetaDataColumns
Marks columns as containing meta data.
- Parameters:
metaDataColumns -
-
markMetaDataColumn
- Parameters:
column-
-
isMetaDataColumn
- Parameters:
column -
- Returns:
-
isMetaDataRow
- Parameters:
row -
- Returns:
-
markMetaDataRow
- Parameters:
row-
-
setMetaDataRows
- Parameters:
metaDataRows-
-
hideEmptyRows
- Throws:
Exception
-
hideEmptyColumns
- Throws:
Exception
-
hideRow
- Parameters:
row -
hide -
-
hideColumn
- Parameters:
column -
hide -
-
replaceColumnValues
Change values in a column using a lookup, for example where 0 = something and 1 = something different.
- Parameters:
column -
values -
- Throws:
Exception
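A minimal sketch of this kind of lookup-based replacement follows. The class `ReplaceValuesSketch` is hypothetical, and leaving unmapped values unchanged is an assumption; the documentation above does not say what WorkSheet does with values missing from the map.

```java
import java.util.Map;

// Hypothetical helper, not part of WorkSheet.
class ReplaceValuesSketch {
    /** Returns a copy of the column with every mapped value replaced. */
    static String[] replace(String[] columnValues, Map<String, String> mapping) {
        String[] out = new String[columnValues.length];
        for (int i = 0; i < columnValues.length; i++) {
            // Assumption: values without a mapping are kept as they are.
            out[i] = mapping.getOrDefault(columnValues[i], columnValues[i]);
        }
        return out;
    }
}
```

For instance, a mapping of "0" to "censored" and "1" to "event" would turn a coded status column into readable labels.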
-
applyColumnFilter
Apply a filter to a column to change its values, for example from numeric to nominal based on some range.
- Parameters:
column -
changeValue -
- Throws:
Exception
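The numeric-to-nominal idea can be sketched as below. The interface `ValueChanger` is a hypothetical stand-in for `ChangeValue`, whose methods are not shown on this page, and the "low"/"high" labels and the cutoff are invented for illustration.

```java
// Hypothetical sketch; ValueChanger only stands in for ChangeValue.
class RangeFilterSketch {
    interface ValueChanger {
        String change(String value);
    }

    /** Maps a numeric string to "low" below the cutoff and "high" otherwise. */
    static ValueChanger threshold(double cutoff) {
        return value -> Double.parseDouble(value) < cutoff ? "low" : "high";
    }

    /** Applies the changer to every value of a column and returns the new column. */
    static String[] apply(String[] columnValues, ValueChanger changer) {
        String[] out = new String[columnValues.length];
        for (int i = 0; i < columnValues.length; i++) {
            out[i] = changer.change(columnValues[i]);
        }
        return out;
    }
}
```

Keeping the rule in a small callback means the same column walk can serve any range-based recoding.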
-
addColumn
- Parameters:
column -
defaultValue -
-
addColumns
Add columns to worksheet and set default value.
- Parameters:
columns -
defaultValue -
-
addRow
- Parameters:
row -
defaultValue -
-
addRows
Add rows to the worksheet and fill in default value.
- Parameters:
rows -
defaultValue -
-
addCell
Add data to a cell.
- Parameters:
row -
col -
value -
- Throws:
Exception
-
isValidRow
- Parameters:
row -
- Returns:
-
isValidColumn
- Parameters:
col -
- Returns:
-
setCacheDoubleValues
- Parameters:
value -
-
getCellDouble
- Parameters:
row -
col -
- Returns:
- Throws:
Exception
-
getCell
Get cell value.
- Parameters:
row -
col -
- Returns:
- Throws:
Exception
-
changeRowHeader
- Parameters:
changeValue-
-
changeColumnHeader
- Parameters:
changeValue-
-
changeRowHeader
- Parameters:
row -
newRow -
- Throws:
Exception
-
changeColumnsHeaders
Rename columns: each column named by a key of the HashMap is renamed to that key's value.
- Parameters:
newColumnValues -
- Throws:
Exception
-
changeColumnHeader
- Parameters:
col -
newCol -
- Throws:
Exception
-
getColumnIndex
- Parameters:
column -
- Returns:
- Throws:
Exception
-
getRowIndex
- Parameters:
row -
- Returns:
- Throws:
Exception
-
getRandomDataColumns
- Parameters:
number -
- Returns:
-
getRandomDataColumns
- Parameters:
number -
columns -
- Returns:
-
getAllColumns
Get the list of column names, including those that may be hidden.
- Returns:
-
getColumns
Get the list of column names. Does not include hidden columns.
- Returns:
-
getDiscreteColumnValues
Get back a list of unique values in the column.
- Parameters:
column -
- Returns:
- Throws:
Exception
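Collecting the unique values of a column can be sketched in a few lines. The class `DiscreteValuesSketch` is hypothetical, and returning the values in first-seen order is an assumption; the ordering used by WorkSheet is not stated here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of WorkSheet.
class DiscreteValuesSketch {
    /** Returns each distinct value once, in the order it was first seen. */
    static List<String> discreteValues(List<String> columnValues) {
        // LinkedHashSet drops duplicates while remembering insertion order.
        return new ArrayList<>(new LinkedHashSet<>(columnValues));
    }
}
```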
-
getDiscreteRowValues
Get back a list of unique values in the row.
- Parameters:
row -
- Returns:
- Throws:
Exception
-
getAllRows
Get all rows, including those that may be hidden.
- Returns:
-
getRows
Get the list of row names. Will exclude hidden values.
- Returns:
-
getDataRows
Get the list of row names.
- Returns:
-
getLogScale
Get the log scale of this worksheet. A zero value will be set to 0.1 because log(0) is undefined.
- Parameters:
base -
- Returns:
- Throws:
Exception
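The per-cell arithmetic can be sketched as follows, using the change-of-base identity log_b(x) = ln(x) / ln(b). The class `LogScaleSketch` is hypothetical and handles a single value only; how WorkSheet walks its cells or treats non-numeric and negative values is not covered.

```java
// Hypothetical helper, not part of WorkSheet.
class LogScaleSketch {
    /** Log of `value` in the given base; an exact zero is replaced by `zeroValue` first. */
    static double logScale(double value, double base, double zeroValue) {
        double v = (value == 0.0) ? zeroValue : value; // log(0) is undefined
        return Math.log(v) / Math.log(base);           // change of base
    }
}
```

With a zero replacement of 0.1 and base 10, a zero cell maps to approximately -1.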
-
getLogScale
Get the log scale of this worksheet.
- Parameters:
base -
- Returns:
- Throws:
Exception
-
swapRowAndColumns
Swap the rows and columns, returning a new worksheet.
- Returns:
- Throws:
Exception
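Swapping rows and columns is a transpose. The sketch below shows the idea on a plain rectangular `String[][]`; the class `TransposeSketch` is hypothetical and ignores row and column headers, which the real worksheet would also have to swap.

```java
// Hypothetical helper, not part of WorkSheet.
class TransposeSketch {
    /** Returns a new grid in which cell (r, c) of the input appears at (c, r). */
    static String[][] transpose(String[][] cells) {
        int rows = cells.length;
        int cols = rows == 0 ? 0 : cells[0].length; // assumes a rectangular grid
        String[][] out = new String[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                out[c][r] = cells[r][c];
            }
        }
        return out;
    }
}
```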
-
unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(String w1FileName, String w2FileName, char delimitter, boolean secondSheetMetaData) throws Exception
Combine two worksheets by joining on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data, then a meta data column will be added between the two joined columns.
- Parameters:
w1FileName -
w2FileName -
delimitter -
secondSheetMetaData -
- Returns:
- Throws:
Exception
-
unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) throws Exception
Combine two worksheets by joining on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data, then a meta data column will be added between the two joined columns.
- Parameters:
w1 -
w2 -
secondSheetMetaData -
- Returns:
- Throws:
Exception
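The row join described above behaves like an inner join keyed on row name. The sketch below models each sheet as a map from row name to that row's values; the class `RowJoinSketch` and this map representation are assumptions, and the meta data column handling is deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of WorkSheet.
class RowJoinSketch {
    /** Keeps only rows present in both sheets and concatenates their values. */
    static Map<String, List<String>> join(Map<String, List<String>> w1,
                                          Map<String, List<String>> w2) {
        Map<String, List<String>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : w1.entrySet()) {
            List<String> other = w2.get(e.getKey());
            if (other == null) {
                continue; // row missing from the second sheet: removed
            }
            List<String> combined = new ArrayList<>(e.getValue());
            combined.addAll(other); // columns of w1 first, then columns of w2
            joined.put(e.getKey(), combined);
        }
        return joined; // rows found only in w2 are never visited, so they are removed too
    }
}
```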
-
readCSV
Read a CSV/tab-delimited file where you pass in the delimiter.
- Parameters:
fileName -
delimiter -
- Returns:
- Throws:
Exception
-
readCSV
- Throws:
Exception
-
readCSV
Read a CSV/tab-delimited file where you pass in the delimiter.
- Parameters:
is -
delimiter -
- Returns:
- Throws:
Exception
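Reading delimited text from a stream with a caller-supplied delimiter can be sketched with the standard library alone. The class `DelimitedReaderSketch` is hypothetical; it assumes UTF-8 input, skips empty lines, and does not handle quoted fields, so it is not a substitute for readCSV.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WorkSheet.
class DelimitedReaderSketch {
    /** Splits every non-empty line of the stream on the given delimiter. */
    static List<String[]> read(InputStream is, char delimiter) {
        // Quote the delimiter so characters such as '|' are not treated as regex syntax.
        String splitter = Pattern.quote(String.valueOf(delimiter));
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                rows.add(line.split(splitter, -1)); // limit -1 keeps trailing empty cells
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

Passing `','` or `'\t'` selects comma-separated or tab-separated input with the same code path.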
-
saveCSV
Save the worksheet as a CSV file.
- Parameters:
fileName -
- Throws:
Exception
-
saveTXT
- Parameters:
fileName -
- Throws:
Exception
-
setRowHeader
- Parameters:
value-
-
appendWorkSheetColumns
Add columns from a second worksheet, joined by common row. If the appended worksheet does not contain a row that is in the master worksheet, the default value "" is used. Rows in the appended worksheet that are not found in the master worksheet are not added.
- Parameters:
worksheet -
- Throws:
Exception
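The append rule above resembles a left join keyed on row name, with "" filling the gaps. The sketch below models each sheet as a map from row name to values; the class `AppendColumnsSketch`, the map representation, and the `appendedColumnCount` parameter are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of WorkSheet.
class AppendColumnsSketch {
    /** Extends every master row with the appended sheet's values, or "" when absent. */
    static Map<String, List<String>> append(Map<String, List<String>> master,
                                            Map<String, List<String>> appended,
                                            int appendedColumnCount) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : master.entrySet()) {
            List<String> combined = new ArrayList<>(e.getValue());
            List<String> extra = appended.get(e.getKey());
            for (int c = 0; c < appendedColumnCount; c++) {
                combined.add(extra == null ? "" : extra.get(c)); // "" when the row is missing
            }
            result.put(e.getKey(), combined);
        }
        return result; // rows that exist only in `appended` are not added
    }
}
```

Unlike the row join of unionWorkSheetsRowJoin, every master row survives here; only its appended cells may be empty.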
-
appendWorkSheetRows
Add rows from a second worksheet, joined by common column. If the appended worksheet does not contain a column that is in the master worksheet, the default value "" is used. Columns in the appended worksheet that are not found in the master worksheet are not added.
- Parameters:
worksheet -
- Throws:
Exception
-
save
- Parameters:
outputStream -
delimitter -
quoteit -
- Throws:
Exception
-
getIndexColumnName
- Returns:
- the indexColumnName
-
setIndexColumnName
- Parameters:
indexColumnName- the indexColumnName to set
-
getColumnLookup
- Returns:
- the columnLookup
-
getRowLookup
- Returns:
- the rowLookup
-
getMetaDataColumnsHashMap
- Returns:
- the metaDataColumnsHashMap
-
getMetaDataRowsHashMap
- Returns:
- the metaDataRowsHashMap
-
getRowHeader
- Returns:
- the rowHeader
-