I have a small cube (about 100 MB) with about 10 dimensions.
I have created a few YTD calculated members like this:
SUM
( PeriodsToDate (
    [end_date].[Financial].[FYear],
    TAIL (EXISTING [end_date].[Financial].[FYYYYMM].Members, 1).Item(0)
  ),
  [Measures].[LOS]
)
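Each of these sits in the cube's MDX script as a calculated measure, along these lines (a sketch; [LOSYTD] is one of them):
-- Sketch: how one of the YTD calculated members is declared
CREATE MEMBER CURRENTCUBE.[Measures].[LOSYTD]
AS
    SUM
    ( PeriodsToDate (
        [end_date].[Financial].[FYear],
        TAIL (EXISTING [end_date].[Financial].[FYYYYMM].Members, 1).Item(0)
      ),
      [Measures].[LOS]
    );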
Now if I do a simple MDX like this:
SELECT
  NON EMPTY ([end_date].[FYYYYMM Attribute].[200708]) ON COLUMNS,
  NON EMPTY ([specialty].[SubSpecialtyDesc Attribute].AllMembers,
             [purchase_unit].[PUC].AllMembers) ON ROWS
FROM [cube_contract_reporting]
WHERE (
  [Measures].[ActualRawVolumeYTD],
  [purchase_unit].[WIES_PU].[WIES PU].&,
  [end_date].[Financial].[FYear].&[2007],
  [purchaser].[PHFundedFlag Attribute].&
)
The first run returns results in 8 minutes, which is long; the second run, however, takes 50 seconds, which is okay.
If I run something like this:
SELECT
  NON EMPTY ([end_date].[FYYYYMM Attribute].AllMembers) ON COLUMNS,
  NON EMPTY ([specialty].[SubSpecialtyDesc Attribute].AllMembers,
             [purchase_unit].[PUC].AllMembers) ON ROWS
FROM [cube_contract_reporting]
WHERE (
  [Measures].[ActualRawVolumeYTD],
  [purchase_unit].[WIES_PU].[WIES PU].&,
  [end_date].[Financial].[FYear].&[2007],
  [purchaser].[PHFundedFlag Attribute].&
)
it takes a few hours for the first run and about an hour for the second run...
Is this a caching problem, or am I doing something wrong?
I have created many aggregations and tried every trick I could think of or find online... nothing changed the performance!
Actually, the caching does appear to be working, as the second runs are all noticeably faster than the initial run. It appears to be the calculation that is slowing things down. There are some things that can be done to tune YTD-style calculations, but they usually rely on having a natural date hierarchy. We need to know a few things about your cube to see how best to tune it.
Is [LOS] a base measure or a calculation?
How fast do the queries run if you use a base measure instead of the YTD calculation? (A quick way to test this is sketched after these questions.)
Does the [end_date].[FYYYYMM Attribute] attribute have a relationship to the [end_date].[FYear] attribute?
Do the [purchase_unit].[WIES_PU] and [purchase_unit].[PUC] attributes have a relationship defined between them?
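For the second question, a quick test is to run your first sample query with the base measure swapped in. A sketch, assuming the base measure behind [ActualRawVolumeYTD] is called [ActualRawVolume] (substitute the real name, and keep your other slicers as they were):
Code Snippet
-- Sketch: the first sample query with an assumed base measure
-- [ActualRawVolume] in place of the YTD calculation, to isolate
-- how much time the calculation itself is costing
SELECT
  NON EMPTY ([end_date].[FYYYYMM Attribute].[200708]) ON COLUMNS,
  NON EMPTY ([specialty].[SubSpecialtyDesc Attribute].AllMembers,
             [purchase_unit].[PUC].AllMembers) ON ROWS
FROM [cube_contract_reporting]
WHERE (
  [Measures].[ActualRawVolume],  -- assumed name of the base measure
  [end_date].[Financial].[FYear].&[2007]
)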
|||
Darren Gosbell wrote:
We need to know a few things about your cube to see how best to tune it.
I haven't received the email notification, so I didn't know of your reply; sorry for being late.
[LOS] is a base measure; LOSYTD, however, is a calculated member (one of about 20 YTD fields).
How fast on base measures: very fast, 6 seconds for the very same query with LOS instead of LOSYTD.
There is a Financial hierarchy that holds the fiscal year dates: FYear -> FQTR -> FYYYYMM -> Date.
No relationship exists between WIES_PU and PUC.
Also, one "interesting" note: I noticed that if I have only ONE YTD calculation in the cube, then the first run takes a long time (say 50 min.) and the second run takes about 50 seconds (which is long but bearable).
Thank you for helping me out.
|||
Yes, there seems to have been an issue with the notifications. I got none for a number of days and then over 20 of them came through today.
The number of YTD calculations in the cube should not really make a difference; I can't think why it would impact performance, as calculations are only executed as they are referenced. This is a bit of a concern, but I can't think what could be causing this behaviour.
Seeing that you do have relationships (and presumably a user hierarchy) defined on the Financial date, you might be able to optimize it using the following technique: http://cwebbbi.spaces.live.com/Blog/cns!1pi7ETChsJ1un_2s41jm9Iyg!111.entry. This basically "walks" up the date hierarchy, so that if you are on month 11, rather than adding up 11 months it adds up 3 quarters and 2 months, which can significantly speed up your calculation (although SP2 might do this without a code change).
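A rough sketch of that idea against your Financial hierarchy (the [LOSYTD Walk] name and the first-quarter guard are my own, and like the CurrentMember variants below it assumes a single current month; see the linked post for the full treatment):
Code Snippet
-- Sketch: "walking" YTD = completed quarters so far this fiscal year
-- plus the months of the current quarter ([LOSYTD Walk] is an assumed name)
CREATE MEMBER CURRENTCUBE.[Measures].[LOSYTD Walk]
AS
    SUM
    ( {
        -- completed quarters earlier in the same fiscal year, if any
        IIF( Ancestor([end_date].[Financial].CurrentMember,
                      [end_date].[Financial].[FQTR]) IS
             Ancestor([end_date].[Financial].CurrentMember,
                      [end_date].[Financial].[FYear]).FirstChild,
             {},
             PeriodsToDate( [end_date].[Financial].[FYear],
                            Ancestor([end_date].[Financial].CurrentMember,
                                     [end_date].[Financial].[FQTR]).PrevMember ) ),
        -- months from the start of the current quarter to the current month
        PeriodsToDate( [end_date].[Financial].[FQTR],
                       [end_date].[Financial].CurrentMember )
      },
      [Measures].[LOS]
    );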
Are you on SP2? There were some performance enhancements particularly aimed at running-sum and YTD-style calculations that may help significantly (see: http://www.sqljunkies.com/WebLog/mosha/archive/2006/11/17/rsum_performance.aspx).
If "feels" as if there are not appropriate aggregations at the month level and that the code is being forced to go down to the leaf level. The SP2 fix mentioned above may help here as may running the usage based optimisation wizard.
Also, just looking back at your calculation: do you really need the EXISTING operator in there? Neither of your sample queries appears to require it, and it would probably run much faster with a reference to CurrentMember.
Code Snippet
SUM
( PeriodsToDate (
    [end_date].[Financial].[FYear],
    [end_date].[Financial].[FYYYYMM].CurrentMember
  ),
  [Measures].[LOS]
)
This of course gives you a relative YTD, but if you want to replicate the fixed year-to-date that I think you probably get with the Existing function, then you could do something like the following. I don't think this would handle the All member properly, but you could fix that case with an appropriate SCOPE statement.
Code Snippet
SUM
( PeriodsToDate (
    [end_date].[Financial].[FYear],
    Tail([end_date].[Financial].[FYYYYMM].CurrentMember.Siblings, 1).Item(0)
  ),
  [Measures].[LOS]
)
Hi,
Thank you for your help. I will certainly push for SP2 as it seems to be a good solution.
The reason behind using "existing" is to allow the user to multi-select. In a common scenario, a business user would ask what the YTD was for a particular month in a particular year, and then try another month, and another, etc.
The first calculation you suggested would fail on multi-selects because it can't determine the current member, and I suppose the second will too, but I'm not sure.
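For example, a user will often end up with something like this (month members made up), where there is no single current month for CurrentMember to resolve to:
-- Hypothetical multi-select: two months in the slicer set. A
-- CurrentMember-based calc has no single month to work from here, while
-- TAIL(EXISTING ...Members, 1) still picks up the last selected month.
SELECT
  [Measures].[ActualRawVolumeYTD] ON COLUMNS
FROM [cube_contract_reporting]
WHERE ( { [end_date].[FYYYYMM Attribute].[200707],
          [end_date].[FYYYYMM Attribute].[200708] } )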
I don't know what is happening BUT I did the following:
1. Removed a dimension with a huge set of records that we used for testing only (that is not used in the MDX AT ALL)
2. Partitioned the data based on financial year.
The same MDX (no change) now does the first run in 30 minutes and the second in 1 minute.
I will see if SP2 further brings this down and let you know. Thanks for your help!
|||
No, neither of the calcs I suggested in the previous post will work in a multi-select scenario. They will work with multiple members on an axis, but not if you have multiple months in a set in the WHERE clause. It was just that your sample query showed selection by year, so I thought the multi-select capability might not be strictly required.
Adding partitioning makes sense, as your queries were only hitting a single year, so the storage engine has less data to scan. Even though SSAS does not require you to set the data slice for a partition, there has been some evidence recently that doing so can improve performance, so you might want to make sure you are doing this.
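The slice itself is just an MDX member expression set on each partition; a sketch for your fiscal-year partitions:
Code Snippet
-- Sketch: Slice expression for the partition holding fiscal year 2007
-- (set it in the partition's properties in BIDS, or via XMLA)
[end_date].[Financial].[FYear].&[2007]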
Removing a large dimension that was not part of the query would also reduce the overall size of the cube, but it may also point to the query not finding appropriate aggregations. The updated samples for SP2 include an aggregation designer; while you would want to be really careful about manually designing your own aggregations, this tool does a good job of presenting the existing aggregations in a GUI, which makes it easy to see which attributes are included in the designed aggregations.
One final suggestion that I just thought of: sometimes breaking your calculation into smaller pieces can help, as the individual bits can be cached separately by the formula engine. So breaking out the finding of the last month member from the actual PeriodsToDate sum might speed things up too. It's hard to know for sure whether this will have much of an impact, but it should be fairly easy for you to test:
Code Snippet
-- Break the calculation in two: first a member that isolates finding
-- the last existing month, so the formula engine can cache it separately
CREATE MEMBER CURRENTCUBE.[end_date].[Financial].[CurrentEndMonth]
AS TAIL (EXISTING [end_date].[Financial].[FYYYYMM].Members, 1).Item(0);

-- ...then the YTD calculation references that member
CREATE MEMBER CURRENTCUBE.[Measures].[LOSYTD]
AS
    SUM
    ( PeriodsToDate (
        [end_date].[Financial].[FYear],
        [end_date].[Financial].[CurrentEndMonth]
      ),
      [Measures].[LOS]
    );