Q1: Why should I use FITs?

Answer: FITs is designed to impute very sparse and noisy read-count matrices from single-cell open-chromatin profiles. Traditional imputation methods either cannot handle such sparse matrices or cannot recover the signal of minor cell types in imbalanced data sets. FITs handles both cases: it works on highly sparse read-count matrices and on imbalanced data sets, and it recovers the signal of minor cell types.

 

Q2: I have a large read-count matrix. How can I use FITs?

Answer: FITs includes two functions for large count matrices: FITSPhase1L and FITSPhase2L. See the documentation for details on using them. Note that FITSPhase1L does not need a large amount of RAM, but it is slow because it divides the data into overlapping chunks and imputes them serially. To compensate, you can run multiple FITSPhase1L processes at the same time.
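To make the idea of overlapping chunks concrete, here is a minimal generic sketch in Python. It is not the FITSPhase1L implementation: this FAQ does not say how FITs forms its chunks, so the function name `overlapping_chunks`, the chunk size, and the overlap are all hypothetical. It only illustrates why processing one index range at a time keeps memory use low.

```python
def overlapping_chunks(n_items, chunk_size, overlap):
    """Yield (start, stop) index ranges that cover 0..n_items.

    Consecutive ranges share `overlap` items. Hypothetical helper for
    illustration only; FITs may split its data differently.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    start = 0
    while start < n_items:
        stop = min(start + chunk_size, n_items)
        yield start, stop
        if stop == n_items:
            break  # the last range reached the end of the data
        start += step


# Ten items, chunks of four, one shared item between neighbours.
print(list(overlapping_chunks(10, 4, 1)))  # [(0, 4), (3, 7), (6, 10)]
```

Only one range needs to be in memory at a time, which is the trade-off described above: low RAM use in exchange for longer run time.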

 

Q3: What is the format of the input read-count matrix provided to FITs?

Answer: The input read-count file has genomic locations as rows and cells as columns. The read-count values are separated by commas (,). If you use another separator (delimiter), such as a tab or a space, FITs may not be able to read the file. We are working on support for other separators.
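If your matrix uses another separator, one option is to convert it to comma-separated form before passing it to FITs. The sketch below uses only Python's standard `csv` module and is not part of FITs. The function name `convert_to_csv` and the example values are hypothetical, and it assumes a single-character delimiter such as a tab.

```python
import csv
import io


def convert_to_csv(src, dst, delimiter="\t"):
    """Copy delimiter-separated rows from src to dst as comma-separated rows.

    src and dst are text streams (open files or io.StringIO objects).
    Returns the number of rows written.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, lineterminator="\n")
    n_rows = 0
    for row in reader:
        writer.writerow(row)
        n_rows += 1
    return n_rows


# Hypothetical matrix: 3 genomic locations (rows) x 3 cells (columns).
tab_text = "0\t1\t0\n2\t0\t0\n0\t0\t1\n"
out = io.StringIO()
convert_to_csv(io.StringIO(tab_text), out)
print(out.getvalue())
```

For files on disk, open both with `newline=""`, as the `csv` module documentation recommends, and pass the file objects in place of the `io.StringIO` streams.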

 

Q4: I am not able to use the MATLAB or Python versions. What should I do?

Answer: If you are using Linux, we provide a pre-compiled Linux executable. It is a large file, but it spares you the headache of installing dependencies.

 

Q5: Can I try FITs on other single-cell profiles, such as expression, DNase-seq, or MNase-seq?

Answer: Yes, you can apply FITs to other single-cell profiles. It may prove especially useful for highly sparse and imbalanced read-count data.

 

Q6: If I can transform read-counts into TF-motif scores from a scATAC-seq profile for clustering, why should I use read-counts on peaks for classification?

Answer: Transforming to TF-motif scores certainly reduces the size of a scATAC-seq profile, and probably its sparsity as well. However, it may not always work: with very high sparsity, the scores for cell-type-specific motifs can be very low and noisy in every cell. In that case, imputation can help by adding information to the TF-motif scores too. Moreover, using read-counts on peaks directly for classification gives you more options for feature selection.