| Model result (NMI) | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | |
K | LDA | T-LDA | TOT | DLDA | MfTM | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.479 | 0.274 | 0.451 | 0.268 | 0.555 | *** | *** | *** | *** | *- | * | | | *** | ** | *- | | *** | *** | *** | *** |
100 | 0.498 | 0.397 | 0.498 | 0.362 | 0.585 | *** | *** | *** | *** | ** | *- | * | * | *** | *** | *** | ** | *** | *** | *** | *** |
150 | 0.526 | 0.411 | 0.503 | 0.379 | 0.598 | *** | *** | *** | *** | ** | ** | *- | *- | *** | *** | *** | *** | *** | *** | *** | *** |
200 | 0.522 | 0.432 | 0.530 | 0.397 | 0.618 | *** | *** | *** | *** | *** | ** | *- | ** | *** | *** | *** | *** | *** | *** | *** | *** |
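The NMI columns in these tables score how well a clustering agrees with reference labels. As a minimal sketch of how such a score can be computed: the normalization behind the reported values is not stated here, so the arithmetic-mean form below is only one common choice, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization: 2 * I(A;B) / (H(A) + H(B))."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both labelings put every item in a single group
    return 2.0 * mutual_information(a, b) / (h_a + h_b)
```

The score does not depend on how clusters are numbered: relabelling the clusters of a perfect clustering still gives 1, while a clustering that is independent of the reference labels gives 0.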
Dataset ML
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.312 | 0.369 | 0.358 | 0.375 | 0.398 | 0.421 | *** | * | ** | *- | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
100 | 0.312 | 0.345 | 0.356 | 0.362 | 0.397 | 0.415 | *** | | *- | * | ** | *** | *** | *** | *** | ** | *** | *** | *** | *** | *** | *** | *** |
150 | 0.312 | 0.340 | 0.348 | 0.353 | 0.398 | 0.422 | *** | *- | ** | *- | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
200 | 0.312 | 0.324 | 0.352 | 0.339 | 0.383 | 0.421 | *** | * | *- | *- | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
Dataset HL
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.393 | 0.463 | 0.268 | 0.469 | 0.355 | 0.573 | *** | *** | *** | *** | *** | ** | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
100 | 0.393 | 0.449 | 0.305 | 0.448 | 0.375 | 0.576 | *** | *** | *** | *** | *** | ** | ** | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
150 | 0.393 | 0.448 | 0.299 | 0.461 | 0.379 | 0.563 | *** | *** | *** | *** | *** | ** | ** | ** | ** | *** | *** | *** | *** | *** | *** | *** | *** |
200 | 0.393 | 0.427 | 0.307 | 0.444 | 0.379 | 0.586 | *** | *** | *** | *** | *** | ** | ** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
Dataset ML
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.487 | 0.338 | 0.390 | 0.338 | 0.237 | 0.520 | | *** | *** | *** | *** | *** | *** | *- | | *** | *** | *** | *** | *** | ** | * | *- |
100 | 0.487 | 0.389 | 0.420 | 0.293 | 0.308 | 0.571 | *- | *** | *** | *** | *** | *** | *** | *** | ** | *** | *** | *** | *** | *** | *** | ** | ** |
150 | 0.487 | 0.469 | 0.463 | 0.266 | 0.356 | 0.552 | *- | *** | *** | *** | *** | *** | *** | ** | *- | *** | *** | *** | *** | *** | *** | ** | ** |
200 | 0.487 | 0.499 | 0.460 | 0.256 | 0.393 | 0.515 | | *** | *** | *** | ** | *** | ** | | | *** | *** | *** | *** | ** | *- | | |
Dataset HL
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.689 | 0.504 | 0.317 | 0.473 | 0.281 | 0.737 | | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
100 | 0.689 | 0.543 | 0.453 | 0.488 | 0.398 | 0.776 | *- | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
150 | 0.689 | 0.563 | 0.462 | 0.460 | 0.438 | 0.762 | * | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
200 | 0.689 | 0.562 | 0.482 | 0.475 | 0.478 | 0.788 | *- | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
Dataset ML
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.398 | 0.283 | 0.348 | 0.386 | 0.333 | 0.403 | | * | | | ** | ** | | | | | | | | | | | |
100 | 0.398 | 0.387 | 0.358 | 0.369 | 0.336 | 0.452 | * | *** | ** | *** | *** | *** | *- | ** | ** | *- | ** | *- | *- | *** | *- | ** | *- |
150 | 0.398 | 0.369 | 0.362 | 0.372 | 0.334 | 0.445 | | ** | ** | *** | *** | ** | *- | ** | *- | * | *- | *- | *- | ** | *- | *- | * |
200 | 0.398 | 0.368 | 0.361 | 0.371 | 0.296 | 0.456 | *- | *** | *** | *** | *** | *** | ** | *** | ** | *- | ** | ** | *- | *** | *- | ** | *- |
Dataset HL
| Model result (NMI) | | | | | | Statistical significance of MfTM | | | | | | | | | | | | | | | | |
K | TFIDF | LDA | T-LDA | TOT | DLDA | MfTM | TFIDF | DLDA 50 | DLDA 100 | DLDA 150 | DLDA 200 | LDA 50 | LDA 100 | LDA 150 | LDA 200 | TOT 50 | TOT 100 | TOT 150 | TOT 200 | T-LDA 50 | T-LDA 100 | T-LDA 150 | T-LDA 200 |
50 | 0.548 | 0.443 | 0.499 | 0.688 | 0.487 | 0.790 | ** | *** | *** | *** | *** | *** | *** | *** | *** | ** | *** | ** | *** | *** | *** | *** | *** |
100 | 0.548 | 0.545 | 0.520 | 0.679 | 0.477 | 0.769 | *- | ** | ** | ** | ** | *** | *- | ** | ** | | | | | ** | *- | *- | *- |
150 | 0.548 | 0.483 | 0.510 | 0.702 | 0.423 | 0.783 | *** | *** | *** | *** | *** | ** | *** | *** | *** | *- | ** | *- | ** | *** | *** | *** | *** |
200 | 0.548 | 0.439 | 0.525 | 0.686 | 0.402 | 0.750 | *- | ** | ** | ** | ** | ** | *- | ** | ** | | | | | ** | *- | *- | *- |
Web-document-based SE
SE amount | K-means - ML | K-means - HL | DBSCAN - ML | DBSCAN - HL |
0H_10W | 0.3970 | 0.5382 | 0.4198 | 0.7658 |
0H_20W | 0.4093 | 0.5455 | 0.4234 | 0.7715 |
0H_30W | 0.4103 | 0.5569 | 0.4127 | 0.7625 |
0H_40W | 0.4119 | 0.5611 | 0.4393 | 0.7579 |
0H_50W | 0.4206 | 0.5566 | 0.4200 | 0.7683 |
Hashtag-based SE
| K-means - ML | | K-means - HL | | DBSCAN - ML | | DBSCAN - HL | |
SE amount | TimeSens | Naïve | TimeSens | Naïve | TimeSens | Naïve | TimeSens | Naïve |
0H_10W | 0.3970 | 0.3970 | 0.5382 | 0.5382 | 0.4198 | 0.4198 | 0.7658 | 0.7658 |
10H_10W | 0.4140 | 0.3942 | 0.5682 | 0.5591 | 0.4483 | 0.4271 | 0.7512 | 0.7552 |
20H_10W | 0.4059 | 0.3977 | 0.5666 | 0.5603 | 0.4518 | 0.4345 | 0.7697 | 0.7602 |
30H_10W | 0.4081 | 0.3972 | 0.5724 | 0.5641 | 0.4617 | 0.4549 | 0.7748 | 0.7579 |
40H_10W | 0.4085 | 0.3983 | 0.5614 | 0.5668 | 0.4351 | 0.4503 | 0.7689 | 0.7532 |
50H_10W | 0.4104 | 0.3975 | 0.5645 | 0.5585 | 0.4384 | 0.4430 | 0.7642 | 0.7643 |
Dataset ML
| NMI | | | Utility | | |
| Direct | K-means | DBSCAN | Direct | K-means | DBSCAN |
All | 0.4371 | 0.4151 | 0.4516 | 0.0000 | 0.0000 | 0.0000 |
person | 0.3975 | 0.3906 | 0.4230 | 0.0396 | 0.0245 | 0.0285 |
organization | 0.3963 | 0.3964 | 0.4251 | 0.0408 | 0.0187 | 0.0265 |
location | 0.4119 | 0.3925 | 0.4396 | 0.0252 | 0.0226 | 0.0119 |
timestamp | 0.3938 | 0.3863 | 0.4241 | 0.0433 | 0.0288 | 0.0274 |
Dataset HL
| NMI | | | Utility | | |
| Direct | K-means | DBSCAN | Direct | K-means | DBSCAN |
All | 0.5850 | 0.5760 | 0.7686 | 0.0000 | 0.0000 | 0.0000 |
person | 0.5858 | 0.5737 | 0.7394 | -0.0008 | 0.0022 | 0.0292 |
organization | 0.5769 | 0.5654 | 0.7203 | 0.0080 | 0.0105 | 0.0483 |
location | 0.5634 | 0.5518 | 0.7396 | 0.0216 | 0.0242 | 0.0290 |
timestamp | 0.5049 | 0.4863 | 0.6357 | 0.0800 | 0.0897 | 0.1329 |
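The Utility columns are consistent with a simple difference: the NMI of the "All" row minus the NMI of the feature's own row, agreeing to within rounding of the last digit. The sketch below checks this on the Direct column of the Dataset HL table. Reading each feature row as the score obtained when that feature is left out is an assumption, and the names used are illustrative.

```python
# NMI values copied from the "Direct" column of the Dataset HL table.
NMI_ALL = 0.5850
NMI_BY_FEATURE = {
    "person": 0.5858,
    "organization": 0.5769,
    "location": 0.5634,
    "timestamp": 0.5049,
}
# Utility values as tabulated for the same column.
TABULATED_UTILITY = {
    "person": -0.0008,
    "organization": 0.0080,
    "location": 0.0216,
    "timestamp": 0.0800,
}


def utility(nmi_all, nmi_feature):
    """Utility as the drop in NMI relative to the 'All' configuration."""
    return nmi_all - nmi_feature


computed = {name: utility(NMI_ALL, value) for name, value in NMI_BY_FEATURE.items()}

for name, value in computed.items():
    print(f"{name:12s} computed={value:+.4f} tabulated={TABULATED_UTILITY[name]:+.4f}")
```

A larger utility means the score falls further below the "All" configuration; on this dataset the timestamp row shows the largest drop for all three methods.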
MfTM: Perplexity of online inference and Gibbs sampling
| K=50 | | K=100 | |
No. of processed posts | Online MfTM | Gibbs MfTM | Online MfTM | Gibbs MfTM |
9000 | 38954.05291 | 5436.854216 | 433560.7559 | 5394.711381 |
18000 | 11412.08046 | 5436.854216 | 168259.4554 | 5394.711381 |
28000 | 9915.462608 | 5436.854216 | 123737.2801 | 5394.711381 |
37000 | 8522.545313 | 5436.854216 | 98436.51515 | 5394.711381 |
46000 | 8360.735552 | 5436.854216 | 93890.7899 | 5394.711381 |
55000 | 8206.932254 | 5436.854216 | 81969.27728 | 5394.711381 |
65000 | 8217.231083 | 5436.854216 | 72794.09422 | 5394.711381 |
74000 | 7850.013603 | 5436.854216 | 65468.59898 | 5394.711381 |
83000 | 7619.565621 | 5436.854216 | 57456.54233 | 5394.711381 |
92000 | 7832.040264 | 5436.854216 | 50867.56732 | 5394.711381 |
102000 | 7656.407878 | 5436.854216 | 48435.60101 | 5394.711381 |
111000 | 7314.840181 | 5436.854216 | 45652.16889 | 5394.711381 |
120000 | 6952.172525 | 5436.854216 | 43837.89957 | 5394.711381 |
129000 | 6911.282114 | 5436.854216 | 40223.5684 | 5394.711381 |
139000 | 6652.048845 | 5436.854216 | 38213.09417 | 5394.711381 |
148000 | 6359.11887 | 5436.854216 | 33726.94741 | 5394.711381 |
157000 | 6378.679089 | 5436.854216 | 32082.00939 | 5394.711381 |
166000 | 6251.727982 | 5436.854216 | 29148.06264 | 5394.711381 |
176000 | 6298.507438 | 5436.854216 | 28487.10686 | 5394.711381 |
185000 | 6021.331931 | 5436.854216 | 27810.17761 | 5394.711381 |
194000 | 6229.49054 | 5436.854216 | 27507.86676 | 5394.711381 |
203000 | 6269.612359 | 5436.854216 | 26179.01307 | 5394.711381 |
213000 | 6505.408664 | 5436.854216 | 24487.13875 | 5394.711381 |
222000 | 6516.804255 | 5436.854216 | 23619.28939 | 5394.711381 |
231000 | 6548.422809 | 5436.854216 | 21628.86641 | 5394.711381 |
240000 | 6838.797775 | 5436.854216 | 19936.58102 | 5394.711381 |
250000 | 6504.31229 | 5436.854216 | 19000.39792 | 5394.711381 |
259000 | 6404.878071 | 5436.854216 | 18904.3863 | 5394.711381 |
268000 | 6296.096943 | 5436.854216 | 19083.60995 | 5394.711381 |
277000 | 6344.933808 | 5436.854216 | 18556.29485 | 5394.711381 |
287000 | 6003.767901 | 5436.854216 | 18829.10748 | 5394.711381 |
296000 | 5672.863434 | 5436.854216 | 19336.59254 | 5394.711381 |
305000 | 5572.085299 | 5436.854216 | 18675.15085 | 5394.711381 |
314000 | 5243.437613 | 5436.854216 | 18450.46253 | 5394.711381 |
324000 | 5281.43218 | 5436.854216 | 17743.68828 | 5394.711381 |
333000 | 4976.244005 | 5436.854216 | 17871.43465 | 5394.711381 |
342000 | 4923.813338 | 5436.854216 | 16745.03062 | 5394.711381 |
351000 | 4936.621638 | 5436.854216 | 16279.57859 | 5394.711381 |
361000 | 5228.118139 | 5436.854216 | 15595.54858 | 5394.711381 |
370000 | 5017.67584 | 5436.854216 | 15334.84515 | 5394.711381 |
379000 | 5165.453349 | 5436.854216 | 15453.06173 | 5394.711381 |
388000 | 5310.989257 | 5436.854216 | 14458.94203 | 5394.711381 |
398000 | 5310.232177 | 5436.854216 | 12270.04187 | 5394.711381 |
407000 | 5278.435048 | 5436.854216 | 13311.25391 | 5394.711381 |
416000 | 5038.958705 | 5436.854216 | 12698.37858 | 5394.711381 |
425000 | 5003.200908 | 5436.854216 | 12534.04177 | 5394.711381 |
435000 | 4879.546328 | 5436.854216 | 12117.14531 | 5394.711381 |
444000 | 4789.460008 | 5436.854216 | 12803.74382 | 5394.711381 |
453000 | 4625.591986 | 5436.854216 | 13851.62926 | 5394.711381 |
462000 | 4759.306322 | 5436.854216 | 13846.26884 | 5394.711381 |
472000 | 4656.385148 | 5436.854216 | 13676.13343 | 5394.711381 |
481000 | 4560.016936 | 5436.854216 | 12856.69776 | 5394.711381 |
490000 | 4487.351773 | 5436.854216 | 12629.99905 | 5394.711381 |
499000 | 4648.488226 | 5436.854216 | 12997.07871 | 5394.711381 |
509000 | 4655.472425 | 5436.854216 | 11421.4722 | 5394.711381 |
518000 | 4594.198427 | 5436.854216 | 12067.07643 | 5394.711381 |
527000 | 4706.580796 | 5436.854216 | 11956.88068 | 5394.711381 |
536000 | 4585.309551 | 5436.854216 | 11786.46867 | 5394.711381 |
546000 | 4438.855613 | 5436.854216 | 11312.7086 | 5394.711381 |
555000 | 4409.223859 | 5436.854216 | 9758.927789 | 5394.711381 |
564000 | 4328.138007 | 5436.854216 | 10218.88819 | 5394.711381 |
573000 | 4319.5537 | 5436.854216 | 10247.98738 | 5394.711381 |
583000 | 4262.659429 | 5436.854216 | 10300.74324 | 5394.711381 |
592000 | 4155.319315 | 5436.854216 | 11431.82246 | 5394.711381 |
601000 | 4212.170698 | 5436.854216 | 10931.35664 | 5394.711381 |
610000 | 4355.414886 | 5436.854216 | 10846.03247 | 5394.711381 |
620000 | 4443.916626 | 5436.854216 | 10556.711 | 5394.711381 |
629000 | 4449.646778 | 5436.854216 | 10534.19834 | 5394.711381 |
638000 | 4293.125583 | 5436.854216 | 11083.89359 | 5394.711381 |
647000 | 4293.660137 | 5436.854216 | 11190.4455 | 5394.711381 |
657000 | 4153.918992 | 5436.854216 | 10967.45041 | 5394.711381 |
666000 | 4284.363564 | 5436.854216 | 10090.8466 | 5394.711381 |
675000 | 4462.040812 | 5436.854216 | 10170.32119 | 5394.711381 |
684000 | 4278.325172 | 5436.854216 | 9990.436615 | 5394.711381 |
694000 | 4249.490423 | 5436.854216 | 8859.10066 | 5394.711381 |
703000 | 4034.002043 | 5436.854216 | 9267.554392 | 5394.711381 |
712000 | 3924.484991 | 5436.854216 | 9472.557061 | 5394.711381 |
721000 | 4169.707689 | 5436.854216 | 8991.771543 | 5394.711381 |
731000 | 4179.127236 | 5436.854216 | 8965.612967 | 5394.711381 |
740000 | 4074.761661 | 5436.854216 | 8645.167622 | 5394.711381 |
749000 | 4093.680471 | 5436.854216 | 8493.63616 | 5394.711381 |
758000 | 4035.394378 | 5436.854216 | 8733.662126 | 5394.711381 |
768000 | 4107.239718 | 5436.854216 | 9158.293843 | 5394.711381 |
777000 | 4324.118287 | 5436.854216 | 8938.728955 | 5394.711381 |
786000 | 4232.039823 | 5436.854216 | 9100.168502 | 5394.711381 |
795000 | 4248.492108 | 5436.854216 | 8818.786074 | 5394.711381 |
804000 | 4156.021246 | 5436.854216 | 8673.538321 | 5394.711381 |
814000 | 4008.661253 | 5436.854216 | 8279.599484 | 5394.711381 |
823000 | 4316.635901 | 5436.854216 | 8150.832293 | 5394.711381 |
832000 | 4310.130945 | 5436.854216 | 7871.37286 | 5394.711381 |
841000 | 4358.392727 | 5436.854216 | 10255.14363 | 5394.711381 |
851000 | 4310.564317 | 5436.854216 | 10366.42932 | 5394.711381 |
860000 | 4107.712683 | 5436.854216 | 10206.95941 | 5394.711381 |
869000 | 4131.141379 | 5436.854216 | 10015.74029 | 5394.711381 |
878000 | 4247.599424 | 5436.854216 | 10099.86774 | 5394.711381 |
888000 | 4300.235717 | 5436.854216 | 9839.907128 | 5394.711381 |
897000 | 4478.040326 | 5436.854216 | 9840.009603 | 5394.711381 |
906000 | 4517.02726 | 5436.854216 | 9806.945239 | 5394.711381 |
915000 | 4300.984055 | 5436.854216 | 9742.078066 | 5394.711381 |
925000 | 4328.470732 | 5436.854216 | 9822.567687 | 5394.711381 |
934000 | 4322.412 | 5436.854216 | 9874.627654 | 5394.711381 |
943000 | 4253.386 | 5436.854216 | 7265.004413 | 5394.711381 |
952000 | 4250.155 | 5436.854216 | 7012.479024 | 5394.711381 |
962000 | 4194.836 | 5436.854216 | 7102.557604 | 5394.711381 |
971000 | 4163.516 | 5436.854216 | 7098.197686 | 5394.711381 |
980000 | 4160.676 | 5436.854216 | 6896.701516 | 5394.711381 |
989000 | 4147.478 | 5436.854216 | 6903.719135 | 5394.711381 |
999000 | 4085.938 | 5436.854216 | 7273.609172 | 5394.711381 |
1008000 | 4041.14 | 5436.854216 | 6958.094994 | 5394.711381 |
1017000 | 4044.412 | 5436.854216 | 6893.089884 | 5394.711381 |
1026000 | 3905.233 | 5436.854216 | 6873.24313 | 5394.711381 |
1036000 | 3887.984 | 5436.854216 | 6998.064377 | 5394.711381 |
1045000 | 3904.338 | 5436.854216 | 7152.819602 | 5394.711381 |
1054000 | 3906.964 | 5436.854216 | 7683.427099 | 5394.711381 |
1063000 | 3847.375 | 5436.854216 | 7193.110886 | 5394.711381 |
1073000 | 3932.91 | 5436.854216 | 6619.446467 | 5394.711381 |
1082000 | 3848.14 | 5436.854216 | 6544.883039 | 5394.711381 |
1091000 | 3984.966 | 5436.854216 | 7143.350273 | 5394.711381 |
1100000 | 3909.894 | 5436.854216 | 7499.79215 | 5394.711381 |
1110000 | 4009.298 | 5436.854216 | 7595.999789 | 5394.711381 |
1119000 | 4249.073 | 5436.854216 | 7597.370657 | 5394.711381 |
1128000 | 4420.327 | 5436.854216 | 8153.946378 | 5394.711381 |
1137000 | 4340.078 | 5436.854216 | 8169.412494 | 5394.711381 |
1147000 | 4282.924 | 5436.854216 | 8255.788492 | 5394.711381 |
1156000 | 4437.297 | 5436.854216 | 7816.793279 | 5394.711381 |
1165000 | 4363.984 | 5436.854216 | 8109.29166 | 5394.711381 |
1174000 | 4253.124775 | 5436.854216 | 8495.123885 | 5394.711381 |
1184000 | 4177.880815 | 5436.854216 | 8534.350791 | 5394.711381 |
1193000 | 4067.586191 | 5436.854216 | 8056.088784 | 5394.711381 |
1202000 | 4102.249209 | 5436.854216 | 7565.178858 | 5394.711381 |
1211000 | 4158.634089 | 5436.854216 | 7810.974793 | 5394.711381 |
1221000 | 4162.303369 | 5436.854216 | 7892.355234 | 5394.711381 |
1230000 | 4064.209298 | 5436.854216 | 7407.609035 | 5394.711381 |
1239000 | 5030.033174 | 5436.854216 | 8662.947151 | 5394.711381 |
1248000 | 4854.904152 | 5436.854216 | 8315.425165 | 5394.711381 |
1258000 | 4855.49659 | 5436.854216 | 8242.389028 | 5394.711381 |
1267000 | 4856.068752 | 5436.854216 | 8336.052782 | 5394.711381 |
1276000 | 4809.227949 | 5436.854216 | 8157.917736 | 5394.711381 |
1285000 | 4877.147976 | 5436.854216 | 8121.026714 | 5394.711381 |
1295000 | 4867.709505 | 5436.854216 | 8143.711533 | 5394.711381 |
1304000 | 4861.294751 | 5436.854216 | 8836.776656 | 5394.711381 |
1313000 | 4874.78377 | 5436.854216 | 8817.199668 | 5394.711381 |
1322000 | 5075.378425 | 5436.854216 | 9644.091644 | 5394.711381 |
1332000 | 5084.324045 | 5436.854216 | 9676.572562 | 5394.711381 |
1341000 | 4124.784886 | 5436.854216 | 8440.436688 | 5394.711381 |
1350000 | 4222.365443 | 5436.854216 | 8721.173304 | 5394.711381 |
1359000 | 4231.637675 | 5436.854216 | 8713.090886 | 5394.711381 |
1369000 | 4186.898696 | 5436.854216 | 8489.82004 | 5394.711381 |
1378000 | 4115.195422 | 5436.854216 | 8321.261595 | 5394.711381 |
1387000 | 4120.213817 | 5436.854216 | 8289.776778 | 5394.711381 |
1396000 | 4062.055723 | 5436.854216 | 8150.702833 | 5394.711381 |
1406000 | 4016.444647 | 5436.854216 | 7394.741218 | 5394.711381 |
1415000 | 3851.146437 | 5436.854216 | 6984.060955 | 5394.711381 |
1424000 | 3602.163068 | 5436.854216 | 5947.032498 | 5394.711381 |
1433000 | 3557.385674 | 5436.854216 | 5752.579395 | 5394.711381 |
1443000 | 3545.083163 | 5436.854216 | 5566.803534 | 5394.711381 |
1452000 | 3556.769886 | 5436.854216 | 5568.951746 | 5394.711381 |
1461000 | 3598.763624 | 5436.854216 | 5799.97752 | 5394.711381 |
1470000 | 3571.985804 | 5436.854216 | 5786.085495 | 5394.711381 |
1480000 | 3547.950776 | 5436.854216 | 5720.809119 | 5394.711381 |
1489000 | 3500.54739 | 5436.854216 | 5684.279974 | 5394.711381 |
1498000 | 3551.761282 | 5436.854216 | 5883.762904 | 5394.711381 |
1507000 | 3479.999416 | 5436.854216 | 5712.844657 | 5394.711381 |
1517000 | 3742.394715 | 5436.854216 | 7122.655597 | 5394.711381 |
1526000 | 3862.144612 | 5436.854216 | 7433.759107 | 5394.711381 |
1535000 | 3947.162615 | 5436.854216 | 7576.679681 | 5394.711381 |
1544000 | 3899.202764 | 5436.854216 | 7502.51836 | 5394.711381 |
1553000 | 3941.880124 | 5436.854216 | 7618.650917 | 5394.711381 |
1563000 | 3823.183743 | 5436.854216 | 7327.030348 | 5394.711381 |
1572000 | 3833.19325 | 5436.854216 | 7322.471396 | 5394.711381 |
1581000 | 3942.391426 | 5436.854216 | 7660.962963 | 5394.711381 |
1590000 | 3896.37793 | 5436.854216 | 7534.736776 | 5394.711381 |
1600000 | 3894.874127 | 5436.854216 | 7435.579121 | 5394.711381 |
1609000 | 4010.031283 | 5436.854216 | 7739.810421 | 5394.711381 |
1618000 | 3694.545293 | 5436.854216 | 6260.827474 | 5394.711381 |
1627000 | 3721.07425 | 5436.854216 | 6346.092943 | 5394.711381 |
1637000 | 3592.616959 | 5436.854216 | 6195.620201 | 5394.711381 |
1646000 | 3596.342089 | 5436.854216 | 6258.032103 | 5394.711381 |
1655000 | 3515.007081 | 5436.854216 | 5917.602259 | 5394.711381 |
1664000 | 3524.637294 | 5436.854216 | 5842.611225 | 5394.711381 |
1674000 | 3454.117194 | 5436.854216 | 5799.59023 | 5394.711381 |
1683000 | 3414.006764 | 5436.854216 | 5593.812368 | 5394.711381 |
1692000 | 3584.028563 | 5436.854216 | 6038.695423 | 5394.711381 |
1701000 | 3500.371087 | 5436.854216 | 5876.945165 | 5394.711381 |
1711000 | 3447.495038 | 5436.854216 | 5739.319892 | 5394.711381 |
1720000 | 3528.127795 | 5436.854216 | 5832.833525 | 5394.711381 |
1729000 | 3392.938312 | 5436.854216 | 5487.020133 | 5394.711381 |
1738000 | 3470.859063 | 5436.854216 | 5554.712709 | 5394.711381 |
1748000 | 3511.176547 | 5436.854216 | 5623.317707 | 5394.711381 |
1757000 | 3429.807474 | 5436.854216 | 5550.82247 | 5394.711381 |
1766000 | 3414.069382 | 5436.854216 | 5531.986252 | 5394.711381 |
1775000 | 3484.795892 | 5436.854216 | 5678.017256 | 5394.711381 |
1785000 | 3502.412908 | 5436.854216 | 5726.089727 | 5394.711381 |
1794000 | 3504.021128 | 5436.854216 | 5510.00623 | 5394.711381 |
1803000 | 3500.780361 | 5436.854216 | 5522.832251 | 5394.711381 |
1812000 | 3606.414001 | 5436.854216 | 7120.374062 | 5394.711381 |
1822000 | 3712.775817 | 5436.854216 | 7677.900749 | 5394.711381 |
1831000 | 3753.098584 | 5436.854216 | 7937.112674 | 5394.711381 |
1840000 | 3710.468196 | 5436.854216 | 7938.483192 | 5394.711381 |
1849000 | 3650.506295 | 5436.854216 | 7817.623217 | 5394.711381 |
1859000 | 3639.066685 | 5436.854216 | 7704.672053 | 5394.711381 |
1868000 | 3656.747051 | 5436.854216 | 7864.35834 | 5394.711381 |
1877000 | 3655.12349 | 5436.854216 | 8047.393758 | 5394.711381 |
1886000 | 3650.960332 | 5436.854216 | 8021.872526 | 5394.711381 |
1896000 | 3459.00136 | 5436.854216 | 7756.125758 | 5394.711381 |
1905000 | 3479.494013 | 5436.854216 | 7838.925577 | 5394.711381 |
1914000 | 3308.144793 | 5436.854216 | 6083.507158 | 5394.711381 |
1923000 | 3214.977568 | 5436.854216 | 5616.23831 | 5394.711381 |
1933000 | 3142.33835 | 5436.854216 | 5287.639303 | 5394.711381 |
1942000 | 3132.795242 | 5436.854216 | 5196.331005 | 5394.711381 |
1951000 | 3199.06027 | 5436.854216 | 5294.177692 | 5394.711381 |
1960000 | 3313.091753 | 5436.854216 | 5590.104464 | 5394.711381 |
1970000 | 3263.728525 | 5436.854216 | 5399.837059 | 5394.711381 |
1979000 | 3283.442012 | 5436.854216 | 5133.389122 | 5394.711381 |
1988000 | 3299.612013 | 5436.854216 | 5342.940333 | 5394.711381 |
1997000 | 3351.56392 | 5436.854216 | 5410.658785 | 5394.711381 |
2007000 | 3422.514222 | 5436.854216 | 5675.166988 | 5394.711381 |
2016000 | 3438.890355 | 5436.854216 | 5708.40702 | 5394.711381 |
2025000 | 3452.687236 | 5436.854216 | 5713.086497 | 5394.711381 |
2034000 | 3554.414357 | 5436.854216 | 5975.765099 | 5394.711381 |
2044000 | 3590.115038 | 5436.854216 | 6035.264498 | 5394.711381 |
2053000 | 3499.310271 | 5436.854216 | 5842.520378 | 5394.711381 |
2062000 | 3389.280294 | 5436.854216 | 5594.602171 | 5394.711381 |
2071000 | 3444.028061 | 5436.854216 | 5712.755899 | 5394.711381 |
2081000 | 3419.410949 | 5436.854216 | 5776.169699 | 5394.711381 |
2090000 | 3378.89711 | 5436.854216 | 5532.232467 | 5394.711381 |
2099000 | 3333.40237 | 5436.854216 | 5575.957187 | 5394.711381 |
2108000 | 3316.843 | 5436.854216 | 5502.765027 | 5394.711381 |
2118000 | 3344.139663 | 5436.854216 | 5558.196631 | 5394.711381 |
2127000 | 3349.911174 | 5436.854216 | 5561.888372 | 5394.711381 |
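For reading the perplexity table, a minimal sketch of the conventional definition (the exponential of the negative mean per-token log-likelihood) may help. How the tabulated values were actually computed, for example on which held-out posts, is not described here, so the function below is an assumption about the metric, and `first_below` is a hypothetical helper. The excerpt of the K=50 online curve is copied from the table.

```python
import math


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token (lower is better)."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Constant Gibbs MfTM perplexity for K=50, from the table.
GIBBS_K50 = 5436.854216

# Excerpt of the Online MfTM K=50 column: (processed posts, perplexity).
ONLINE_K50_EXCERPT = [
    (296000, 5672.863434),
    (305000, 5572.085299),
    (314000, 5243.437613),
    (324000, 5281.43218),
]


def first_below(curve, baseline):
    """Return the first checkpoint whose perplexity is below the baseline, or None."""
    for n_posts, value in curve:
        if value < baseline:
            return n_posts
    return None


print(first_below(ONLINE_K50_EXCERPT, GIBBS_K50))
```

Every K=50 online value before this excerpt is above the Gibbs figure, so the crossing found in the excerpt is also the first one in the full column.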
Impact of the number of topics (K) and semantic enrichment on training time (hh:mm:ss)
| K=50 | K=100 | K=150 | K=200 |
MfTM-T | 00:35:33 | 00:50:33 | 01:06:00 | 01:21:41 |
MfTM-T+H | 00:37:15 | 00:53:41 | 01:10:00 | 01:24:18 |
MfTM-T+W | 00:36:19 | 00:51:49 | 01:08:00 | 01:22:08 |
MfTM-T+H+W | 00:37:48 | 00:53:23 | 01:10:00 | 01:25:13 |
Statistical analysis of the above results using linear regression:
| Linear function | | Goodness of fit | | |
| Weight | Bias | Chi Square | Mean L1 Error | Root Mean Squared Error |
MfTM-T | 2.14E-04 | 0.01386825 | 5.85E-08 | 1.20E-04 | 1.21E-04 |
MfTM-T+H | 2.19E-04 | 2.19E-04 | 6.37E-07 | 3.70E-04 | 3.99E-04 |
MfTM-T+W | 2.13E-04 | 0.01469565 | 4.08E-07 | 2.62E-04 | 3.19E-04 |
MfTM-T+H+W | 2.21E-04 | 0.0151964 | 1.60E-07 | 1.70E-04 | 2.00E-04 |
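The regression coefficients can be cross-checked by refitting a straight line to the training times. The sketch below uses ordinary least squares with K as the predictor and the training time expressed as a fraction of a 24-hour day. That unit is inferred from the size of the tabulated weights rather than stated, and the helper names are illustrative.

```python
TOPICS = [50, 100, 150, 200]

# Training times (hh:mm:ss) copied from the table above.
TRAINING_TIME = {
    "MfTM-T": ["00:35:33", "00:50:33", "01:06:00", "01:21:41"],
    "MfTM-T+H": ["00:37:15", "00:53:41", "01:10:00", "01:24:18"],
    "MfTM-T+W": ["00:36:19", "00:51:49", "01:08:00", "01:22:08"],
    "MfTM-T+H+W": ["00:37:48", "00:53:23", "01:10:00", "01:25:13"],
}


def hms_to_days(hms):
    """Convert 'hh:mm:ss' to a fraction of a 24-hour day (assumed unit)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) / 86400


def fit_line(xs, ys):
    """Ordinary least squares fit y = weight * x + bias; returns (weight, bias)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    weight = sxy / sxx
    return weight, mean_y - weight * mean_x


fits = {
    name: fit_line(TOPICS, [hms_to_days(t) for t in times])
    for name, times in TRAINING_TIME.items()
}

for name, (weight, bias) in fits.items():
    print(f"{name:12s} weight={weight:.3e} bias={bias:.5f}")
```

For MfTM-T this gives a weight of about 2.14E-04 and a bias of about 0.0139, in line with the tabulated coefficients, which supports the day-fraction reading. For MfTM-T+H the same fit gives a weight of about 2.19E-04 and a bias of about 0.0152, close to the biases of the other three variants.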
Impact of dataset size on training time
Running time | 0.5M | 1M | 1.5M | 2M |
MfTM 50 | 00:08:53 | 00:17:46 | 00:26:39 | 00:35:33 |
MfTM 100 | 00:12:38 | 00:25:16 | 00:37:55 | 00:50:33 |
MfTM 200 | 00:20:25 | 00:40:51 | 01:01:16 | 01:21:41 |
Statistical analysis of the above results using linear regression:
| Linear function | | Goodness of fit | | |
| Weight | Bias | Chi Square | Mean L1 Error | Root Mean Squared Error |
MfTM 50 | 1.23E-02 | -5.20E-18 | 3.00E-13 | 2.50E-07 | 2.74E-07 |
MfTM 100 | 1.76E-02 | -3.47E-18 | 5.12E-35 | 2.17E-18 | 3.58E-18 |
MfTM 200 | 2.84E-02 | -5.00E-07 | 2.00E-13 | 2.00E-07 | 2.24E-07 |
Processing time per document (sec.) of MfTM and OG-LDA
| MfTM | OG-LDA | | | |
No. of processed docs (thousands) | K=200 | K=50 | K=100 | K=150 | K=200 |
26 | 0.00410 | 0.08977 | 0.08335 | 0.08658 | 0.09027 |
52 | 0.00387 | 0.09054 | 0.09523 | 0.09719 | 0.10615 |
78 | 0.00381 | 0.11469 | 0.11080 | 0.12838 | 0.14254 |
104 | 0.00376 | 0.14696 | 0.14548 | 0.16838 | 0.18358 |
130 | 0.00368 | 0.16512 | 0.17864 | 0.20227 | 0.22085 |
156 | 0.00364 | 0.18981 | 0.20763 | 0.23945 | 0.26475 |
182 | 0.00362 | 0.20923 | 0.23893 | 0.27965 | 0.30112 |
208 | 0.00359 | 0.23354 | 0.26919 | 0.31547 | 0.34217 |
234 | 0.00358 | 0.26619 | 0.30514 | 0.35168 | 0.38766 |
260 | 0.00353 | 0.28823 | 0.34169 | 0.39166 | 0.42834 |
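One way to read this comparison is as a per-document speed ratio at equal K. The sketch below divides the OG-LDA K=200 column by the MfTM K=200 column, using the values copied from the table; the variable and function names are illustrative.

```python
# Processed documents (thousands) and per-document times (sec.) at K=200.
DOCS_K = [26, 52, 78, 104, 130, 156, 182, 208, 234, 260]
MFTM_K200 = [0.00410, 0.00387, 0.00381, 0.00376, 0.00368,
             0.00364, 0.00362, 0.00359, 0.00358, 0.00353]
OGLDA_K200 = [0.09027, 0.10615, 0.14254, 0.18358, 0.22085,
              0.26475, 0.30112, 0.34217, 0.38766, 0.42834]


def speed_ratios(baseline, candidate):
    """Element-wise ratio baseline / candidate (values above 1 favour the candidate)."""
    return [b / c for b, c in zip(baseline, candidate)]


ratios = speed_ratios(OGLDA_K200, MFTM_K200)

for docs, ratio in zip(DOCS_K, ratios):
    print(f"{docs:4d}k docs: OG-LDA / MfTM = {ratio:6.1f}")
```

The ratio grows at every checkpoint because the OG-LDA time per document rises with the number of processed documents, while the MfTM time per document slowly falls.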
Processing time per document (sec.) of MfTM
No. of processed docs (thousands) | K=50 | K=100 | K=150 | K=200 |
26 | 0.001971 | 0.003077 | 0.002820 | 0.004097 |
52 | 0.001949 | 0.002949 | 0.002748 | 0.003875 |
78 | 0.001913 | 0.002920 | 0.002736 | 0.003811 |
104 | 0.001869 | 0.002866 | 0.002756 | 0.003759 |
130 | 0.001849 | 0.002865 | 0.002770 | 0.003678 |
156 | 0.001827 | 0.002842 | 0.002758 | 0.003638 |
182 | 0.001790 | 0.002830 | 0.002751 | 0.003620 |
208 | 0.001773 | 0.002826 | 0.002732 | 0.003587 |
234 | 0.001762 | 0.002818 | 0.002734 | 0.003576 |
260 | 0.001770 | 0.002818 | 0.002726 | 0.003533 |
286 | 0.001752 | 0.002801 | 0.002715 | 0.003498 |
312 | 0.001745 | 0.002795 | 0.002707 | 0.003467 |
338 | 0.001743 | 0.002791 | 0.002706 | 0.003446 |
364 | 0.001734 | 0.002781 | 0.002783 | 0.003423 |
390 | 0.001741 | 0.002772 | 0.002798 | 0.003399 |
416 | 0.001748 | 0.002761 | 0.002792 | 0.003384 |
442 | 0.001750 | 0.002752 | 0.002778 | 0.003372 |
468 | 0.001752 | 0.002747 | 0.002758 | 0.003358 |
494 | 0.001757 | 0.002744 | 0.002751 | 0.003346 |
520 | 0.001761 | 0.002742 | 0.002748 | 0.003335 |
546 | 0.001769 | 0.002743 | 0.002767 | 0.003324 |
572 | 0.001770 | 0.002740 | 0.002782 | 0.003312 |
598 | 0.001771 | 0.002738 | 0.002772 | 0.003305 |
624 | 0.001769 | 0.002735 | 0.002764 | 0.003296 |
650 | 0.001771 | 0.002733 | 0.002769 | 0.003300 |
676 | 0.001774 | 0.002729 | 0.002795 | 0.003300 |
702 | 0.001779 | 0.002732 | 0.002793 | 0.003296 |
728 | 0.001782 | 0.002731 | 0.002782 | 0.003293 |
754 | 0.001786 | 0.002730 | 0.002799 | 0.003286 |
780 | 0.001785 | 0.002730 | 0.002808 | 0.003281 |
806 | 0.001781 | 0.002728 | 0.002795 | 0.003276 |
832 | 0.001780 | 0.002726 | 0.002793 | 0.003275 |
858 | 0.001774 | 0.002723 | 0.002785 | 0.003273 |
884 | 0.001772 | 0.002719 | 0.002798 | 0.003271 |
910 | 0.001769 | 0.002716 | 0.002828 | 0.003275 |
936 | 0.001766 | 0.002716 | 0.002851 | 0.003271 |
962 | 0.001764 | 0.002715 | 0.002854 | 0.003274 |
988 | 0.001759 | 0.002713 | 0.002871 | 0.003278 |
1014 | 0.001756 | 0.002712 | 0.002896 | 0.003275 |
1040 | 0.001752 | 0.002710 | 0.002914 | 0.003271 |
1066 | 0.001748 | 0.002708 | 0.002909 | 0.003266 |
1092 | 0.001745 | 0.002708 | 0.002900 | 0.003263 |
1118 | 0.001742 | 0.002706 | 0.002890 | 0.003258 |
1144 | 0.001738 | 0.002706 | 0.002890 | 0.003257 |
1170 | 0.001734 | 0.002703 | 0.002899 | 0.003259 |
1196 | 0.001733 | 0.002699 | 0.002893 | 0.003258 |
1222 | 0.001731 | 0.002697 | 0.002883 | 0.003257 |
1248 | 0.001728 | 0.002693 | 0.002873 | 0.003255 |
1274 | 0.001727 | 0.002692 | 0.002863 | 0.003251 |
1300 | 0.001724 | 0.002687 | 0.002855 | 0.003248 |
1326 | 0.001724 | 0.002683 | 0.002846 | 0.003245 |
1352 | 0.001722 | 0.002679 | 0.002837 | 0.003242 |
1378 | 0.001719 | 0.002676 | 0.002830 | 0.003238 |
1404 | 0.001716 | 0.002672 | 0.002823 | 0.003236 |
1430 | 0.001716 | 0.002670 | 0.002816 | 0.003233 |
1456 | 0.001716 | 0.002670 | 0.002810 | 0.003230 |
1482 | 0.001715 | 0.002671 | 0.002803 | 0.003229 |
1508 | 0.001714 | 0.002667 | 0.002797 | 0.003229 |
1534 | 0.001713 | 0.002664 | 0.002790 | 0.003226 |
1560 | 0.001710 | 0.002661 | 0.002783 | 0.003222 |
1586 | 0.001707 | 0.002658 | 0.002777 | 0.003220 |
1612 | 0.001704 | 0.002655 | 0.002770 | 0.003218 |
1638 | 0.001701 | 0.002653 | 0.002764 | 0.003223 |
1664 | 0.001700 | 0.002652 | 0.002759 | 0.003227 |
1690 | 0.001698 | 0.002652 | 0.002754 | 0.003238 |
1716 | 0.001696 | 0.002652 | 0.002749 | 0.003251 |
1742 | 0.001694 | 0.002655 | 0.002744 | 0.003258 |
1768 | 0.001694 | 0.002657 | 0.002740 | 0.003257 |
1794 | 0.001692 | 0.002656 | 0.002734 | 0.003255 |
1820 | 0.001691 | 0.002656 | 0.002728 | 0.003253 |
1846 | 0.001690 | 0.002658 | 0.002724 | 0.003253 |
1872 | 0.001688 | 0.002656 | 0.002721 | 0.003251 |
1898 | 0.001687 | 0.002654 | 0.002717 | 0.003251 |
1924 | 0.001686 | 0.002654 | 0.002714 | 0.003249 |
1950 | 0.001685 | 0.002656 | 0.002711 | 0.003248 |
1976 | 0.001684 | 0.002657 | 0.002706 | 0.003247 |
2002 | 0.001683 | 0.002659 | 0.002702 | 0.003249 |
2028 | 0.001681 | 0.002659 | 0.002698 | 0.003256 |
2054 | 0.001679 | 0.002658 | 0.002694 | 0.003264 |
2080 | 0.001677 | 0.002657 | 0.002690 | 0.003271 |
2106 | 0.001666 | 0.002642 | 0.002686 | 0.003273 |