I have a SQLite database with the following schema:
termcount(doc_num, term, count)
This table stores each term together with its count in a given document, for example:
(doc1, term1, 12)
(doc1, term 22, 2)
...
(docn, term1, 10)
This data can be treated as a sparse matrix, since each document contains only a few terms with a non-zero count.
How can I create a dense matrix from this sparse data using NumPy? I need it to calculate the similarity between documents using cosine similarity.
The dense matrix should look like a table with the doc IDs in the first column and all the terms listed in the first row; the remaining cells contain the counts.
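To make the target layout concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes the table and column names from the schema above (`termcount`, `doc_num`, `term`, `count`) and an already open `sqlite3` connection; the helper names `load_dense_matrix` and `cosine_similarity` are hypothetical. The doc IDs and terms are returned as separate label lists rather than stored inside the numeric array, so that row `i` of the matrix corresponds to `docs[i]` and column `j` to `terms[j]`.

```python
import sqlite3

import numpy as np


def load_dense_matrix(conn):
    """Build a dense (documents x terms) count matrix from the termcount table.

    Assumes the schema termcount(doc_num, term, count) described in the question.
    """
    # "count" is quoted because it is also the name of an SQL function.
    rows = conn.execute(
        'SELECT doc_num, term, "count" FROM termcount'
    ).fetchall()

    # Sorted labels give a stable row/column order.
    docs = sorted({row[0] for row in rows})
    terms = sorted({row[1] for row in rows})
    doc_index = {doc: i for i, doc in enumerate(docs)}
    term_index = {term: j for j, term in enumerate(terms)}

    # Every (doc, term) pair missing from the table stays at zero.
    matrix = np.zeros((len(docs), len(terms)), dtype=float)
    for doc, term, count in rows:
        matrix[doc_index[doc], term_index[term]] = count
    return docs, terms, matrix


def cosine_similarity(matrix):
    """Return the pairwise cosine similarity between the rows of matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for empty documents
    unit_rows = matrix / norms
    return unit_rows @ unit_rows.T
```

With these helpers, `docs, terms, matrix = load_dense_matrix(conn)` followed by `cosine_similarity(matrix)` gives a square array whose entry `[i, j]` is the similarity between `docs[i]` and `docs[j]`. A dense array needs memory proportional to documents times terms, so for a large vocabulary a sparse representation may be the more practical choice.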