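Tf-idf (in full, Term Frequency-Inverse Document Frequency) is, in information retrieval, a general term for numerical values that reflect how important a word is within a document.
A word's term frequency is the number of times the word appears in the document divided by the total number of words in the document.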
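The term frequency definition above can be illustrated with a minimal Python sketch. The function name `term_frequency`, the whitespace tokenization, and the sample sentence are assumptions made for illustration only; real systems usually normalize case and punctuation before counting.

```python
from collections import Counter


def term_frequency(term, tokens):
    """Occurrences of `term` in `tokens` divided by the total token count."""
    if not tokens:
        # Assumption: an empty document gives a term frequency of 0.
        return 0.0
    return Counter(tokens)[term] / len(tokens)


# Hypothetical example document, split on whitespace.
doc = "the cat sat on the mat".split()
tf_the = term_frequency("the", doc)  # "the" is 2 of the 6 tokens
print(tf_the)
```

A word that never appears in the document gets a term frequency of 0, since `Counter` returns 0 for missing keys.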
- Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513-523.