The story of a very expensive filter

One of my customers reported a problem that caused one of their child Notification Servers to run at 100% CPU and consume almost all of its memory (out of 32 GiB available).

I first looked at the timing (it was reported last Friday) and thought it might be linked to the PMImport release: last week was Patch Tuesday, so we released the PMImport on Wednesday and replicated it to the child server on Thursday evening.

But this was not it. For a start, the memory ballooning happened in 3 different processes: the w3wp application pools for Altiris-NS-Agent and TaskManagement, as well as AeXSvc itself.

With all three processes running we would see large chunks of memory being released in a clean drop, then climb right back up in a nice curve. This was because the 3 processes were fighting for scarce memory and forcing each other to be scavenged every now and then.

Stopping one of the application pools pegged memory at ~12 GiB for each of the other two processes, which restored access to the console but did not resolve the problem.

In the end we found that this was a recurrence of an issue seen in November (before I was on the account), caused by a "rogue" filter.

The following SQL allowed us to find and clean up the culprit:

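-- Find the filter with the largest membership (more than 1 million rows)
select top 1 collectionguid, count(*) as members
  from collectionmembership
 group by collectionguid
having count(*) > 1000000
 order by count(*) desc

-- Remove that filter's membership rows
delete from collectionmembership where collectionguid = '<guid found above>'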

The first query returned a count of 2.9 million entries!

The deletion took 25 minutes to run, but after restarting the application pools and the Altiris Service everything was back to work. We looked at the audit information on the filter, and the person who last modified it had, from their recollection, not changed anything specific.
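A single delete of 2.9 million rows runs as one long transaction. The following is only a sketch of a batched alternative, not what we ran on the customer server: the batch size and the variable names are assumptions, and I have not measured whether it would have beaten the 25 minutes. The idea is simply to keep each transaction, and the locks it holds, small.

```sql
-- Sketch only: replace the all-zero guid with the guid found by the query above.
declare @guid uniqueidentifier = '00000000-0000-0000-0000-000000000000';
declare @batch int = 50000;   -- hypothetical batch size, tune for your server
declare @deleted int = 1;

-- Delete in chunks until no rows are left for this filter
while @deleted > 0
begin
    delete top (@batch)
      from collectionmembership
     where collectionguid = @guid;

    -- Rows removed by the delete just above; 0 ends the loop
    set @deleted = @@rowcount;
end
```

Each pass commits on its own, so the loop can be stopped and restarted without losing the work already done.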

However, we saw from the edit view (whilst the deletion was running) that the filter was set to "Query Mode: Query Builder" instead of "Query Mode: none". The filter is used for patch targeting, so we only need filter inclusions or exclusions.

When the same happened today we quickly fixed the issue, but the user again confirmed that he had not done anything bad.

So I tested this on my server and had the same problem: when the query mode is set to Query Builder and you save the filter as is (without modifying anything), all resources are included in the filter.

This doesn't matter on my test system, as it can cope with a lowly 25,000 objects in the cache. But in a large environment the 2.9 million items were fully replicated in memory (we use a complete cache for the collection membership cache) in 3 different processes, demanding an awful lot of resources and grinding the server to a halt.
