Pháp triển các giải thuật song song trong khai phá luật kết hợp
Tóm tắt Pháp triển các giải thuật song song trong khai phá luật kết hợp: ... va` die`ˆu kieˆ.n co . so.’ cu’a no´ deˆ’ sa’n sinh taˆ´t ca’ ca´c maˆ˜u phoˆ’ bieˆ´n lieˆn quan, vo´.i n la` soˆ´ ta´c vu. . Cho W = {w1, w2, . . . , wn} la` taˆ.p tro.ng soˆ´ vo´ .i wi = weight(ti) la` khoˆ´i lu.o.. ng coˆng vieˆ.c deˆ’ xu.’ ly´ ta´c vu. ti cho deˆ´n khi keˆ´t thu´c. Tro.ng ...ma˘.c da`ˆu chu´ng chı’ kha´c nhau ve`ˆ soˆ´ deˆ´m doˆ. hoˆ˜ tro . . . Nhu˜ .ng du.`o.ng di na`y co´ theˆ’ du.o.. c keˆ´t ho . .p va`o trong moˆ. t du .`o.ng di duy nhaˆ´t vo´.i soˆ´ deˆ´m la` toˆ’ng soˆ´ deˆ´m cu’a taˆ´t ca’ ca´c du.`o.ng di. Keˆ´t qua’ cuoˆ´i cu`ng khoˆng bi. a’nh hu .o.’...a`ˆn que´t thu´. hai maˆ˜u die`ˆu kieˆ.n co . so.’ . Khi xaˆy du.. ng ma traˆ.n deˆ´m doˆ´i vo´ .i FP-Tree Tχ (χ 6= ∅), khoˆng co´ su.. trao doˆ’i thoˆng tin giu˜ .a ca´c boˆ. xu .’ ly´ v`ı caˆy va` maˆ˜u die`ˆu kieˆ.n co . so.’ ca`ˆn thieˆ´t deˆ’ xaˆy du.. ng caˆy hieˆ.n dieˆ.n trong cu`ng ...
tr`ınh cu’a die`ˆu kieˆ.n co . so.’ doˆ´i vo´.i moˆ. t mu. c deˆ’ sa’n sinh taˆ´t ca’ ca´c maˆ˜u phoˆ’ bieˆ´n lieˆn quan. Do do´ t´ınh ha.t cu’a ta´c vu. du .o.. c xa´c di.nh bo .’ i soˆ´ bu.´o.c la˘.p deˆ’ sa’n sinh ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ . Deˆ’ xa´c di.nh t´ınh ha.t cu’a ta´c vu. co´ nhieˆ.u qua’ nhu . theˆ´ na`o doˆ´i vo´.i hieˆ.u na˘ng song song, xe´t t`ınh huoˆ´ng khi moˆ. t boˆ. xu .’ ly´ phu.c vu. ca´c yeˆu ca`ˆu coˆng vieˆ.c. Moˆ.t boˆ. xu .’ ly´ dang hoa.t doˆ.ng Pi co´ theˆ’ phu. c vu. ca´c yeˆu ca`ˆu coˆng vieˆ.c sau khi no´ da˜ hoa`n tha`nh vieˆ.c xu .’ ly´ moˆ.t ta´c vu. trong khoˆ´i cu’a no´. Neˆ´u tro.ng soˆ´ cu’a ta´c vu. na`y qua´ nho’ , lu .o.. ng truye`ˆn thoˆng co´ theˆ’ ta˘ng theˆm bo .’ i v`ı moˆ. t soˆ´ lu .o.. ng lo´ .n ta´c vu. co´ theˆ’ du .o.. c gu .’ i di. Ma˘.t kha´c, neˆ´u tro.ng soˆ´ cu’a ta´c vu. na`y qua´ lo´ .n, ca´c boˆ. xu .’ ly´ yeˆu ca`ˆu pha’ i do.. i moˆ. t lu .o.. ng tho` .i gian da´ng keˆ’ cho deˆ´n khi Pi co´ theˆ’ phu. c vu. cho nhu˜.ng yeˆu ca`ˆu coˆng vieˆ.c. Trong ca’ hai tru .`o.ng ho.. p, toa`n boˆ. hieˆ.u na˘ng co´ theˆ’ gia’m. Deˆ’ kieˆ’m soa´t t´ınh ha.t cu’a ta´c vu. , thuaˆ.t toa´n DLB kho .’ i ta.o moˆ. t tham bieˆ´n go. i la` maxload. Nhu˜.ng ta´c vu. vo´ .i t´ınh ha.t lo´ .n ho.n ho.n maxload se˜ du.o.. c chia tha`nh ca´c ta´c vu. nho’ ho .n (ca´c die`ˆu kieˆ.n co . so.’ keˆ´ tieˆ´p nho’ ho.n). Ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ keˆ´ tieˆ´p la` ca´c ta´c vu. doˆ. c laˆ.p (co´ theˆ’ du.o.. c xu .’ ly´ doˆ.c laˆ.p cho deˆ´n khi nhu˜ .ng maˆ˜u phoˆ’ bieˆ´n du.o.. c sa’n sinh), va` tro. ng soˆ´ cu’a chu´ng co´ theˆ’ du.o.. c do tru .´o.c khi thu.. c hieˆ.n. Ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ keˆ´ tieˆ´p vo´.i tro.ng soˆ´ nho’ ho.n maxload se˜ du.o.. c xu .’ ly´ cho deˆ´n khi hoa`n tha`nh. Taˆ´t ca’ ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ keˆ´ tieˆ´p co´ tro.ng soˆ´ lo´ .n ho.n maxload se˜ du.o.. c da˘.t va`o trong nga˘n xeˆ´p cu. c boˆ. deˆ’ xu .’ ly´ hoa˘.c gu .’ i to´.i nhu˜.ng boˆ. xu .’ ly´ kha´c trong tu.o.ng lai. 194 NGUYE˜ˆN LONG GIANG Gia´ tri. phu` ho . . p cu’a tham bieˆ´n maxload du .o.. c xa´c di.nh trong thu . . c nghieˆ.m va` se˜ du .o.. c de`ˆ caˆ.p trong pha`ˆn keˆ´t qua’ thu . . c nghieˆ.m. Qua’n ly´ tho`.i gian phu. c vu. Vieˆ.c kieˆ’m soa´t t´ınh ha.t cu’a ta´c vu. co´ theˆ’ gia’m thieˆ’u da´ng keˆ’ tho` .i gian cho`. trong khi nhaˆ.n coˆng vieˆ.c. Tuy nhieˆn, nhu . da˜ tr`ınh ba`y o.’ pha`ˆn da`ˆu, tro.ng soˆ´ cu’a moˆ.t ta´c vu. khoˆng theˆ’ do du.o.. c ch´ınh xa´c tru .´o.c khi thu.. c hieˆ.n. Xe´t t`ınh huoˆ´ng khi boˆ. xu .’ ly´ Pj gu .’ i moˆ. t yeˆu ca`ˆu coˆng vieˆ.c cho boˆ. xu .’ ly´ Pi sau khi Pi da˜ ba˘´t da`ˆu thu . . c hieˆ.n nhieˆ.m vu. mo´ .i, Pj pha’ i do . . i cho deˆ´n khi Pi keˆ´t thu´c coˆng vieˆ.c cu’a no´. Keˆ´t qua’ la`, boˆ. xu .’ ly´ Pj co´ theˆ’ vaˆ˜n co`n bi. cha˘.n trong moˆ. t tho` .i gian da`i. Deˆ’ gia’ i quyeˆ´t vaˆ´n de`ˆ na`y, ta su.’ du. ng ky˜ thuaˆ. t sau. Thay v`ı chı’ phu.c vu. ca´c yeˆu ca`ˆu coˆng vieˆ.c sau khi coˆng vieˆ.c cu’a no´ da˜ hoa`n tha`nh, neˆ´u moˆ. t yeˆu ca`ˆu gu .’ i deˆ´n trong tho`.i gian da`ˆu xu.’ ly´ pha`ˆn tu.’ mo´.i, boˆ. xu .’ ly´ Pi ta.m tho` .i du`.ng la. i va` phu. c vu. ca´c yeˆu ca`ˆu coˆng vieˆ.c gu.’ i deˆ´n, sau do´ tieˆ´p tu. c thu . . c thi coˆng vieˆ.c cu’a no´, vieˆ.c na`y se˜ gia’m thieˆ’u tho` .i gian cho`. va` nhu. vaˆ.y ca’ i thieˆ.n caˆn ba`˘ng ta’ i. Toˆ´i u.u ho.n Da ma traˆ.n deˆ´m. Nhu . da˜ tha’o luaˆ.n o .’ treˆn, vieˆ.c su .’ du. ng ma traˆ.n deˆ´m co´ theˆ’ gia’m thieˆ’u tho`.i gian duyeˆ.t FP-Tree khi xaˆy du . . ng FP-Tree die`ˆu kieˆ.n keˆ´ tieˆ´p. Trong gia’ i thuaˆ.t caˆn ba`˘ng ta’ i t˜ınh, ta chı’ xaˆy du.. ng ma traˆ.n deˆ´m cho FP-Tree da`ˆu tieˆn Ti. Ro˜ ra`ng, y´ tu .o.’ ng su.’ du. ng ma traˆ.n deˆ´m co´ theˆ’ du .o.. c a´p du.ng cho baˆ´t ky` FP-Tree Tχ na`o. Chu´ y´ ra`˘ng maˆ˜u die`ˆu kieˆ.n co . so.’ khi ta xaˆy du.. ng Tχ co´ theˆ’ coi nhu . moˆ. t taˆ.p du˜ . lieˆ.u. Khoˆng gioˆ´ng ma traˆ.n deˆ´m doˆ´i vo´ .i Ti ma` du.o.. c na.p tu` . moˆ. t teˆ.p lu .u tru˜., ma traˆ.n deˆ´m doˆ´i vo´ .i Tχ se˜ du .o.. c xaˆy du . . ng trong suoˆ´t la`ˆn que´t thu´. hai maˆ˜u die`ˆu kieˆ.n co . so.’ . Khi xaˆy du.. ng ma traˆ.n deˆ´m doˆ´i vo´ .i FP-Tree Tχ (χ 6= ∅), khoˆng co´ su.. trao doˆ’i thoˆng tin giu˜ .a ca´c boˆ. xu .’ ly´ v`ı caˆy va` maˆ˜u die`ˆu kieˆ.n co . so.’ ca`ˆn thieˆ´t deˆ’ xaˆy du.. ng caˆy hieˆ.n dieˆ.n trong cu`ng boˆ. xu .’ ly´. Thu´. tu.. xu .’ ly´ ta´c vu. . Trong DLB, thu´ . tu.. ca´c ta´c vu. trong danh sa´ch cu.c boˆ. du .o.. c xu .’ ly´ co´ theˆ’ ta´c doˆ.ng to´ .i toa`n boˆ. hieˆ.u na˘ng cu’a gia’ i thuaˆ. t. Xe´t t`ınh huoˆ´ng khi la`ˆn da`ˆu tieˆn, boˆ. xu .’ ly´ Pi xu .’ ly´ nhu˜.ng ta´c vu. co´ tro.ng soˆ´ nho’ . Giai doa.n da`ˆu, khoˆng co´ yeˆu ca`ˆu coˆng vieˆ.c tu` . ca´c boˆ. xu .’ ly´ kha´c gu.’ i deˆ´n Pi. Ca´c yeˆu ca`ˆu coˆng vieˆ.c chı’ deˆ´n khi Pi keˆ´t thu´c vieˆ.c xu .’ ly´ taˆ´t ca’ ca´c ta´c vu. co´ tro.ng soˆ´ nho’ va` ba˘´t da`ˆu vo´ .i ta´c vu. co´ tro.ng soˆ´ lo´ .n ho.n trong khoˆ´i cu’a no´, die`ˆu na`y co´ theˆ’ ba˘´t ca´c boˆ. xu .’ ly´ cho`. laˆu tru.´o.c khi chu´ng nhaˆ.n coˆng vieˆ.c mo´ .i tu`. Pi. Vı` ly´ do do´, ca´c ta´c vu. trong moˆ˜i khoˆ´i du .o.. c sa˘´p xeˆ´p theo thu´ . tu.. gia’m da`ˆn cu’a tro.ng soˆ´ deˆ’ xu .’ ly´ ca´c ta´c vu. lo´.n ho.n tru.´o.c. Tra´nh bi. cha˘.n khi trao doˆ’i ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ . Sau khi ca´c FP-Tree cu. c boˆ. du .o.. c xaˆy du.. ng trong taˆ´t ca’ ca´c boˆ. xu .’ ly´, nhu˜.ng tie`ˆn toˆ´ du.`o.ng di laˆ´y tu`. nhu˜.ng caˆy ca`ˆn pha’ i trao doˆ’i giu˜.a ca´c boˆ. xu .’ ly´ deˆ’ h`ınh tha`nh ca´c maˆ˜u die`ˆu kieˆ.n co . so.’ . Tieˆ´n tr`ınh trao doˆ’i na`y do`i ho’ i su.. phoˆ´i ho . . p cu’a taˆ´t ca’ ca´c boˆ. xu .’ ly´ va` daˆ˜n to´.i vieˆ.c bi. cha˘.n. Deˆ’ gia’m bo´ .t kha’ na˘ng bi. cha˘.n, moˆ˜i boˆ. xu .’ ly´ cho.n ngaˆ˜u nhieˆn moˆ.t boˆ. xu .’ ly´ d´ıch deˆ’ gu.’ i thay v`ı su.’ du. ng lu .o.. c doˆ` h`ınh tro`n. Taˆ´t ca’ ca´c tie`ˆn toˆ´ du.`o.ng di lieˆn quan deˆ´n moˆ. t pha`ˆn tu .’ trong moˆ. t boˆ. xu .’ ly´ se˜ du.o.. c do´ng go´i tru.´o.c khi gu.’ i di. Ky˜ thuaˆ.t boˆ. deˆ.m cu˜ng du .o.. c su .’ du. ng deˆ’ gia’m thieˆ’u kha’ na˘ng bi. cha˘.n. PHA´T TRIEˆ’N CA´C GIA’ I THUAˆ. T SONG SONG TRONG KHAI PHA´ LUAˆ. T KE´ˆT HO. . P 195 4. NGHIEˆN CU´ . U HIEˆ. U NA˘NG Ca´c keˆ´t qua’ thu.. c nghieˆ.m du .o.. c thu . . c hieˆ.n treˆn PC Atlantis Cluster goˆ`m co´ 16 nu´t. Moˆ˜i nu´t co´ 2 boˆ. xu .’ ly´ Intel Xeon 2.8GHz, 2GB boˆ. nho´ ., va` oˆ’ cu´.ng 80GB. Ca´c nu´t du.o.. c keˆ´t noˆ´i vo´.i nhau bo.’ i Ethernet co´ da’ i thoˆng 1Giga-bits/s. Co. so.’ du˜. lieˆ.u du .o.. c lu .u treˆn d˜ıa cu. c boˆ. cu’a moˆ˜i nu´t. Chu.o.ng tr`ınh u´.ng du. ng du .o.. c laˆ.p tr`ınh ba`˘ng ngoˆn ngu˜ . C. Su.’ du. ng giao thu´ .c truye`ˆn tin-MPI. deˆ’ thu.. c hieˆ.n truye`ˆn thoˆng. Deˆ’ da´nh gia´ hieˆ.u na˘ng, ta su .’ du.ng ca´c taˆ.p du˜ . lieˆ.u gia’ di.nh kha´c nhau du .o.. c sinh ra ba`˘ng vieˆ.c su .’ du. ng thu’ tu. c du .o.. c moˆ ta’ trong[2]. Ca´c thuoˆ.c t´ınh cu’a chu´ng du .o.. c moˆ ta’ trong Ba’ng 3. Ba’ng 3. Thuoˆ.c t´ınh taˆ.p du˜ . lieˆ.u Taˆ.p du˜ . lieˆ.u T I D N Kı´ch thu .´o.c (MB) T15I4D2000K 15 4 1,500,000 10,000 223 T20I6D1000K 20 6 1,000,000 10,000 197 T25I10D200K 25 10 2,000,000 10,000 492 T25I15D150K 25 15 1,500,000 10,000 369 T25I20D100K 25 20 1,000,000 10,000 249 T: Doˆ. da`i trung b`ınh cu’a giao di.ch I: Doˆ. da`i trung b`ınh cu’a taˆ.p pha`ˆn tu .’ phoˆ’ bieˆ´n D: Soˆ´ giao di.ch N: Soˆ´ pha`ˆn tu.’ T´ınh hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng ma traˆ.n deˆ´m Trong ba`i ba´o na`y, ta so sa´nh hieˆ.u na˘ng cu’a gia’ i thuaˆ.t FP-growth ban da`ˆu vo´ .i gia’ i thuaˆ.t su.’ du. ng ky˜ thuaˆ.t ma traˆ.n deˆ´m. Hı`nh 7 cho thaˆ´y tho` .i gian thu.. c hieˆ.n cu’a hai gia’ i thuaˆ.t treˆn na˘m taˆ.p du˜ . lieˆ.u cu’a moˆ.t nu´t. Trong h`ınh 7, gia’ i thuaˆ. t da`ˆu tieˆn co´ teˆn ”Serial”, va` gia’ i thuaˆ.t vo´.i ma traˆ. n deˆ´m co´ teˆn ”Serial-CM”. Doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u 0.1%. Ta thaˆ´y ra`˘ng vo´ .i vieˆ.c su .’ du. ng ma traˆ.n deˆ´m, gia’ i thuaˆ.t thu . . c hieˆn toˆ´t ho .n gia’ i thuaˆt ban da`ˆu treˆn taˆ´t ca’ ca´c taˆ.p du˜ . lieˆ.u. Trong taˆ´t ca’ ca´c th´ı nghieˆ.m kha´c vo´ .i nhu˜.ng gia’ i thuaˆ.t song song du . . a treˆn caˆy, khi de`ˆ caˆ.p deˆ´n gia’ i thuaˆ.t FP-growth, ta su .’ du.ng gia’ i thuaˆ.t FP-growth vo´ .i ky˜ thuaˆ. t ma traˆ. n deˆ´m. H`ınh 7. Hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng ma traˆ.n deˆ´m Hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng ky˜ thuaˆ.t caˆy ba˘m Nghieˆn cu´.u hieˆ.u na˘ng kieˆ’m tra t´ınh hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng ky˜ thuaˆ.t caˆy ba˘m khi taˆ´t 196 NGUYE˜ˆN LONG GIANG ca’ ca´c boˆ. xu .’ ly´ trao doˆ’i ca´c die`ˆu kieˆ.n co . so.’ . Ta deˆ´m soˆ´ tie`ˆn toˆ´ du.`o.ng di ma` moˆ˜i boˆ. xu .’ ly´ pha’ i gu.’ i to´.i ca´c boˆ. xu .’ ly´ kha´c va` so sa´nh no´ vo´.i soˆ´ du.`o.ng di sau khi du.o.. c ho . .p nhaˆ´t ba`˘ng vieˆ.c su .’ du. ng ky˜ thuaˆ.t caˆy ba˘m. Vieˆ.c kieˆ’m tra du .o.. c thu . . c hieˆ.n treˆn taˆ.p du˜ . lieˆ.u T25I20D1000K vo´ .i doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u la` 0.1%. Nhu˜ .ng keˆ´t qua’ du.o.. c du .a ra o.’ Ba’ng 4. Ba’ng 4. Hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng ky˜ thuaˆ.t caˆy ba˘m Soˆ´ nu´t Boˆ. xu .’ ly´ Tru.´o.c khi ho.. p nhaˆ´t Sau khi ho . . p nhaˆ´t 2 0 144,639 3,753 1 148,873 3,951 0 112,369 4,637 1 115,407 4,821 4 2 113,148 4,570 3 112,295 4,732 0 68,332 4,277 1 67,047 4,294 2 67,938 4,191 3 68,839 4,405 8 4 67,331 4,304 5 68,160 4,222 6 67,660 4,323 7 68,570 4,409 0 37,890 3,441 1 37,779 3,584 2 37,114 3,467 3 37,323 3,476 4 37,834 3,512 5 38,223 3,474 6 39,032 3,672 16 7 38,041 3,535 8 37,126 3,518 9 38,163 3,581 10 37,693 3,472 11 37,652 3,435 12 37,670 3,510 13 37,119 3,433 14 37,636 3,514 15 39,494 3,710 Tu`. Ba’ng 4 co´ theˆ’ thaˆ´y ra`˘ng, ba`˘ng vieˆ.c ho . . p nhaˆ´t ca´c tie`ˆn toˆ´ du .`o.ng di gioˆ´ng nhau lieˆn quan deˆ´n ca´c mu. c trong ba’ng tieˆu de`ˆ, soˆ´ du .`o.ng di ca`ˆn thieˆ´t deˆ’ gu.’ i deˆ´n ca´c boˆ. xu .’ ly´ kha´c gia’m thieˆ’u kha´ nhie`ˆu, va` nhu. vaˆ.y, chi ph´ı truye`ˆn thoˆng cu˜ng gia’m thieˆ’u da´ng keˆ’. Gia´ tri. toˆ´i u .u cu’a maxload Nhu. da˜ tr`ınh ba`y trong pha`ˆn da`ˆu, gia´ tri. maxload co´ theˆ’ a’nh hu .o.’ ng deˆ´n hieˆ.u na˘ng cu’a DLB. Khi maxload qua´ lo´.n, vieˆ.c maˆ´t caˆn ba`˘ng ta’ i se˜ ta˘ng theˆm. Ngu .o.. c la. i, neˆ´u maxload qua´ nho’ , lu.o.. ng truye`ˆn thoˆng co´ theˆ’ ta˘ng leˆn, ngoa`i ra co´ qua´ nhie`ˆu maˆ˜u die`ˆu kieˆ.n co . so.’ nho’ du.o.. c lu .u tru˜. trong nga˘n xeˆ´p cu.c boˆ. , do do´ la`m ta˘ng tho` .i gian t´ınh toa´n. Tuy nhieˆn, thaˆ. t kho´ deˆ’ xa´c di.nh gia´ tri. toˆ´i u .u cu’a maxload khi no´ a’nh hu.o.’ ng deˆ´n moˆ.t thuoˆ.c t´ınh cu’a taˆ.p du˜ . lieˆ.u-k´ıch thu .´o.c maˆ˜u phoˆ’ bieˆ´n trung b`ınh. PHA´T TRIEˆ’N CA´C GIA’ I THUAˆ. T SONG SONG TRONG KHAI PHA´ LUAˆ. T KE´ˆT HO. . P 197 Thu.. c hieˆ.n moˆ.t soˆ´ th´ı nghieˆ.m treˆn nhu˜ .ng taˆ.p du˜ . lieˆ.u kha´c nhau deˆ’ da´nh gia´ mu´ .c doˆ. a’nh hu.o.’ ng cu’a maxload deˆ´n hieˆ.u na˘ng. Keˆ´t qua’ th´ı nghieˆ.m cho keˆ´t luaˆ.n ra`˘ng, gia’ i thuaˆ. t da.t hieˆ.u na˘ng toˆ´t nhaˆ´t neˆ´u maxload = 2/3 k´ıch thu.´o.c maˆ˜u phoˆ’ bieˆ´n trung b`ınh. Hı`nh 8 minh ho.a keˆ´t qua’ thu.’ nghieˆ.m treˆn taˆ.p du˜ . lieˆ.u T25I20D1000K co´ k´ıch thu .´o.c maˆ˜u phoˆ’ bieˆ´n trung b`ınh I = 20. Tieˆ´n ha`nh do tho`.i gian thu.. c hieˆ.n giai doa.n hai cu’a gia’ i thuaˆ. t (khai tha´c tu` . FP-Tree) v`ı tham bieˆ´n maxload khoˆng a’nh hu.o.’ ng to´.i giai doa.n da`ˆu cu’a gia’ i thuaˆ.t (xaˆy du . . ng caˆy cu.c boˆ.). Trong nghieˆn cu´ .u na`y, doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u la` 0.1%, va` soˆ´ lu .o.. ng nu´t la` 4. Gia’ i thuaˆ. t da.t du.o.. c hieˆ.u na˘ng toˆ´t nhaˆ´t khi maxload=12. H`ınh 8. A’ nh hu.o.’ ng cu’a vieˆ.c su .’ du. ng ca´c gia´ tri. maxload kha´c nhau Khoa’ng chia doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u cu’a nhu˜ .ng gia’i thuaˆ.t du . . a treˆn caˆy Nghieˆn cu´.u na`y so sa´nh t´ınh oˆ’n di.nh cu’a SLB va` DLB khi doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u gia’m tu` . 0.25% deˆ´n 0.08%. Ca´c th´ı nghieˆ.m du .o.. c thu . . c hieˆ.n treˆn 8 nu´t vo´ .i hai taˆ.p du˜ . lieˆ.u, T25I10D2000K va` T25I20D1000K. Keˆ´t qua’ du.o.. c minh ho.a trong Hı`nh 9. H`ınh 9. Khoa’ng chia doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u Khi doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u gia’m, ca´c taˆ.p mu. c phoˆ’ bieˆ´n trong ca’ hai taˆ.p du˜ . lieˆ.u ta˘ng vo´ .i caˆ´p lu˜y thu`.a. Co´ kha´ nhie`ˆu taˆ.p mu. c phoˆ’ bieˆ´n da`i cu˜ng nhu . soˆ´ lo´.n ca´c taˆ.p mu. c phoˆ’ bieˆ´n nga˘´n 198 NGUYE˜ˆN LONG GIANG trong ca´c taˆ.p du˜ . lieˆ.u. Treˆn taˆ.p du˜ . lieˆ.u T25I10D2000K, doˆ. chia cu’a SLB va` DLB la` ba`˘ng nhau Tuy nhieˆn, doˆ. chia DLB toˆ´t ho .n ho.n SLB treˆn taˆ.p du˜ . lieˆ.u T25I20D1000K. Die`ˆu na`y la` do, vo´.i taˆ.p du˜ . lieˆ.u T25I20D1000K, khi doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u gia’m, soˆ´ taˆ.p phoˆ’ bieˆ´n da`i ta˘ng doˆ.t ngoˆ.t so vo´ .i taˆ.p du˜ . lieˆ.u T25I10D2000K. Nhu . da˜ tr`ınh ba`y trong pha`ˆn tru.´o.c, vieˆ.c do tro.ng soˆ´ cu’a ca´c FP-Tree die`ˆu kieˆ.n da`i hoa˘.c ca´c taˆ.p phoˆ’ bieˆ´n da`i khoˆng ch´ınh xa´c ba`˘ng vieˆ.c do FP-Tree die`ˆu kieˆ.n va` ca´c taˆ.p phoˆ’ bieˆ´n nga˘´n va` do do´ daˆ˜n to´ .i vieˆ.c maˆ´t caˆn ba`˘ng ta’ i. Ta˘ng toˆ´c SLB va` DLB Hı`nh 10 chı’ ra vieˆ.c ta˘ng toˆ´c cu’a gia’ i thuaˆ.t SLB va` DLB treˆn hai taˆ.p du˜ . lieˆ.u khi doˆ. hoˆ˜ tro.. toˆ´i thieˆ’u la` 0.1%. Deˆ’ nghieˆn cu´ .u hieˆ.u na˘ng, ta giu˜ . CSDL khoˆng thay doˆ’i va` thay doˆ’i soˆ´ lu.o.. ng boˆ. xu .’ ly´. Vieˆ.c ta˘ng toˆ´c tu .o.ng doˆ´i du.o.. c do ba˘`ng vieˆ.c su .’ du. ng phu .o.ng tr`ınh Sp = T1/Tp vo´.i Sp la` su . . ta˘ng toˆ´c da.t du .o.. c vo´ .i p boˆ. xu .’ ly´, T1 la` tho` .i gian thu.. c hieˆ.n tua`ˆn tu . . va` Tp la` tho` .i gian thu.. c hieˆ.n su .’ du. ng p boˆ. xu .’ ly´. Nhu. doˆ` thi. minh ho.a, ca’ SLB va` DLB de`ˆu ta˘ng toˆ´c ve`ˆ hieˆ.u na˘ng raˆ´t toˆ´t. Tuy nhieˆn, treˆn hai taˆ.p du˜ . lieˆ.u, DLB co´ ty’ leˆ. ta˘ng toˆ´c toˆ´t ho .n. Co´ theˆ’ thaˆ´y ra`˘ng, vieˆ.c ta˘ng soˆ´ lu .o.. ng boˆ. xu .’ ly´, heˆ. soˆ´ go´c cu’a du .`o.ng toˆ´c doˆ. vu .o.. t qua gio´ .i ha.n. Do´ la` v`ı, khi moˆ˜i nu´t xu.’ ly´ soˆ´ lu.o.. ng du˜ . lieˆ.u nho’ , tho` .i gian truye`ˆn thoˆng chieˆ´m pha`ˆn da´ng keˆ’ trong toˆ’ng tho`.i gian thu.. c hieˆ.n. Co´ ngh˜ıa la`, co´ theˆ’ da.t du .o.. c ty’ leˆ. ta˘ng toˆ´c toˆ´t ho .n vo´.i nhu˜.ng taˆ.p du˜. lieˆ.u lo´ .n ho.n trong soˆ´ lo´.n ca´c giao di.ch, khi do´, tho` .i gian truye`ˆn thoˆng se˜ ı´t a’nh hu.o.’ ng. H`ınh 10. Ta˘ng toˆ´c hieˆ.u na˘ng cu’a ca´c gia’ i thuaˆ.t du . . a treˆn caˆy PHA´T TRIEˆ’N CA´C GIA’ I THUAˆ. T SONG SONG TRONG KHAI PHA´ LUAˆ. T KE´ˆT HO. . P 199 5. KEˆ´T LUAˆ. N Ba`i ba´o na`y da˜ tr`ınh ba`y qua´ tr`ınh pha´t trieˆ’n hai gia’ i thuaˆ.t song song du . . a treˆn phu .o.ng pha´p FP-growth. Gia’ i thuaˆ.t SLB du`ng ky˜ thuaˆ.t ma traˆ.n deˆ´m deˆ’ u .´o.c lu.o.. ng tro.ng soˆ´ cu’a ca´c ta´c vu. song song du .o.. c xu .’ ly´, va` do do´ caˆn ba˘`ng ta’ i giu˜.a ca´c boˆ. xu .’ ly´. Ky˜ thuaˆ.t ma traˆ.n deˆ´m cu˜ng giu´p gia’m bo´.t tho`.i gian duyeˆ.t FP-Tree va` gia’m thieˆ’u thoˆng tin du . thu`.a trao doˆ’i giu˜.a ca´c boˆ. xu .’ ly´. SLB lieˆn quan deˆ´n ca´c ba`i toa´n truye`ˆn thoˆng mu´.c cao ba˘`ng vieˆ.c su .’ du. ng ky˜ thuaˆ.t caˆy ba˘m deˆ’ keˆ´t ho . . p taˆ´t ca’ ca´c tie`ˆn toˆ´ du .`o.ng di gioˆ´ng nhau lieˆn quan deˆ´n ca´c pha`ˆn tu.’ phoˆ’ bieˆ´n laˆ´y tu`. FP-Tree. Gia’ i thuaˆ.t DLB la` gia’ i pha´p caˆn ba`˘ng ta’ i doˆ.ng du .o.. c xaˆy du . . ng du . . a treˆn gia’ i thuaˆ. t SLB. DLB keˆ´ thu`.a taˆ´t ca’ ca´c u.u dieˆ’m cu’a gia’ i thuaˆ.t SLB. Ho .n nu˜.a, DLB co´ kha’ na˘ng phaˆn phoˆ´i la. i ta’ i tro.ng coˆng vieˆ.c giu˜ .a ca´c boˆ. xu .’ ly´ khi hieˆ.n tu .o.. ng maˆ´t caˆn ba`˘ng ta’ i xaˆ’y ra. Moˆ.t soˆ´ gia’ i pha´p toˆ´i u.u du.o.. c pha´t trieˆ’n deˆ’ giu´p DLB da.t du .o.. c hieˆ.u na˘ng t´ınh toa´n cao ho .n. Ca´c gia’ i thuaˆ.t khai tha´c du˜ . lieˆ.u song song SLB va` DLB pha´t trieˆ’n treˆn phu .o.ng pha´p FP-growth da˜ du.o.. c de`ˆ xuaˆ´t trong [1], [4], [8], [11]. Tuy nhieˆn, ca´c gia’ i thuaˆ. t na`y ga˘.p nhie`ˆu ha.n cheˆ´ ve`ˆ hieˆ.u na˘ng va` tho` .i gian thu.. c hieˆ.n. Do´ng go´p ch´ınh cu’a ba`i ba´o la` ca’ i tieˆ´n gia’ i thuaˆ.t DLB ba`˘ng vieˆ.c su .’ du. ng ky˜ thuaˆ.t da ma traˆ. n deˆ´m va` die`ˆu chı’nh thu´ . tu.. xu .’ ly´ ca´c ta´c vu. . Ho .n nu˜.a, ba`i ba´o tieˆ´n ha`nh ca`i da˘.t va` thu .’ nghieˆ.m ca´c gia’ i thuaˆ. t treˆn boˆ. soˆ´ lieˆ.u thu .’ nghieˆ.m deˆ’ da´nh gia´ hieˆ.u na˘ng cu’a ca´c gia’ i thuaˆ.t, tu` . do´ du.a ra ca´c keˆ´t luaˆ.n ve`ˆ t´ınh hieˆ.u qua’ cu’a vieˆ.c su .’ du. ng da ma traˆ.n deˆ´m, t`ım du .o.. c gia´ tri. toˆ´t nhaˆ´t cu’a tham soˆ´ maxload deˆ’ gia’ i thuaˆ.t da.t hieˆ.u na˘ng toˆ´t nhaˆ´t. Ba`i ba´o cu˜ng thu.’ nghieˆ.m moˆ´i lieˆn quan giu˜ .a soˆ´ lu.o.. ng ca´c boˆ. xu .’ ly´ du.o.. c su .’ du. ng, doˆ. lo´ .n cu’a boˆ. soˆ´ lieˆ.u thu .’ nghieˆ.m, doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u vo´ .i hieˆ.u na˘ng cu’a ca´c gia’ i thuaˆ.t ba`˘ng vieˆ.c thu . . c hieˆ.n ta˘ng toˆ´c gia’ i thuaˆ.t SLB va` DLB. Keˆ´t qua’ thu .’ nghieˆ.m chı’ ra ra`˘ng ca’ SLB laˆ˜n DLB co´ su.. ta˘ng toˆ´c ve`ˆ hieˆ.u na˘ng raˆ´t toˆ´t. Giu˜ .a hai gia’ i thuaˆ.t, DLB thu . . c hieˆ.n toˆ´t ho .n ho.n SLB, da˘.c bieˆ.t treˆn nhu˜ .ng taˆ.p du˜ . lieˆ.u raˆ´t lo´ .n vo´.i doˆ. hoˆ˜ tro . . toˆ´i thieˆ’u nho’ . TA`I LIEˆ. U THAM KHA ’O [1] R. Agrawal and J. Shafer, Parallel mining of association rules, IEEE Transactions on Knowledge and Data Eng., USA, 1996 (962–969). [2] R. Agrawal, T. Imielinski, and A. Swami, Mining association rules between sets of items in large databases, Proc. of the ACM SIGMOD Conf.Washington DC, USA, May, 1993. [3] E.W. Dijkstra, W.H. Seijen, and A. J.M. Van Gasteren, Derivation of a termination detection algorithm for a distributed computation, Proc. of the ACM Transactions on Programming Languages and Systems, ACM Press, New York, USA, 1993 (1–35). [4] E.H. Han, G. Karypis, and V. Kumar, Scalable parallel data mining for association rules, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, ACM Press, New York, USA, 1997 (277–288). [5] J. Han, J. Pei, and Y. Yin, Mining frequent patterns without candidate generation, Proc. of the ACM SIGMOD Conf. on Management of Data, ACM Press, New York, USA, 2000 (1–12). [6] A. Mueller, Fast sequential and parallel algorithms for association rule mining: A com- parison, Technical Report CS-TR-3515, College Park, MD, 1995. 200 NGUYE˜ˆN LONG GIANG [7] J. Park, M. Chen, and P. Yu., An effective hash-based algorithm for mining association rules, Proc. ACM SIGMOD Conf. Management of Data, San Jose, California, United States, 1995 (175–186). [8] J. Park, M. Chen, and P. Yum, An efficient parallel data mining for association rules, Proceedings of the 4th International Conference on Information and Knowledge Man- agement, ACM Press, New York, USA, 1995 (31–36). [9] A. Savasere, E. Omiecinski, and S. Navathe, An efficient algorithm for mining association rules in large databases, Proc. of the 21st VLDB Conf. Zurich, Swizerland, 1995. [10] R. Srikant and R. Agrawal, Mining sequential patterns: Generalizations and performance improvements, Proc. of the Fifth Intel Conference on Extending Database Technology, Springer-Verlag, USA, 1996 (3–17). [11] M. J. Zaki, S. Parthasarathy, and W. Li, A localized algorithm for parallel association mining, ACM Symposium Parallel Algorithms and Architectures,ACMPress, New York, USA, 1997 (321–330). Nhaˆ. n ba`i nga`y 3 - 4 - 2007
File đính kèm:
- phap_trien_cac_giai_thuat_song_song_trong_khai_pha_luat_ket.pdf