Monthly Archives: November 2014

Power Query now connects to Analysis Services #ssas #mdx #dax

The latest update of Power Query finally supports Analysis Services Multidimensional and Tabular. I have been waiting for this release for a very long time, and now it’s finally here!

Chris Webb has already written an interesting blog post with several useful links and pieces of information.

You can connect to both Multidimensional and Tabular, but queries are generated in MDX. For this reason, I consider Multidimensional the primary source for now. Many improvements can be made for Multidimensional, whereas Tabular would benefit first from DAX support.

I want to share my feedback, and I already look forward to future improvements: please use the comments on this blog post to share your priorities among the features I would like to see.

Expose Keys

Each hierarchy in a Multidimensional model has one or more levels, corresponding to dimension attributes.
Each attribute has up to three properties:

  • Key
    • Can be multipart, which means it is made up of one or more columns; each column can have a different data type
  • Name
    • This is always a string. If not defined, it corresponds to the Key, which in that case must have a single column. If the attribute has a multipart key, the Name has to be defined in the Multidimensional model
  • Value
    • This property is optional and is not used often. It can represent the name as a numeric value, and it is used when the key cannot be used to represent the corresponding value. This property is not shown to the user, but it can be used in MDX calculations.

The Name is the one shown in the user interface of Power Query. However, when you extract data from a cube, most of the time you need the key column(s) in order to create relationships with other queries. For example, a very common scenario is creating three tables, corresponding to two dimensions and a fact table. The keys used to connect the dimension tables to the fact table are mandatory, but they are not exposed in the attribute name visible in the user interface of a cube.
Thus, exposing the Key is very important. Remember that the Key can be a multipart key, so you might have one or more columns to add.
If a user requests the Key, by default I would show all the columns of a multipart key, so that they can remove the columns that are not required (though usually you will need all of them). Since the Value is not needed as often, I would expose it through a separate menu.
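The role of the key columns can be sketched with a small Python example (the data and column names are hypothetical, and Python is used here only to illustrate the logic; Power Query itself generates MDX): without the multipart key, the attribute Name alone cannot relate a dimension query to a fact query.

```python
# Hypothetical extract of a dimension and a fact table from a cube.
# The attribute Name alone ("March 2014") is not what relates the two
# queries; the multipart Key (year, month) is what makes the join work.
dimension = [
    {"year": 2014, "month": 3, "name": "March 2014"},
    {"year": 2014, "month": 4, "name": "April 2014"},
]
fact = [
    {"year": 2014, "month": 3, "amount": 100.0},
    {"year": 2014, "month": 3, "amount": 50.0},
    {"year": 2014, "month": 4, "amount": 75.0},
]

# Index the dimension by its multipart key (a tuple of key columns).
dim_by_key = {(d["year"], d["month"]): d for d in dimension}

# Resolve each fact row to its dimension member through the key.
joined = [
    {**f, "name": dim_by_key[(f["year"], f["month"])]["name"]}
    for f in fact
]
print(joined[0]["name"])  # March 2014
```

If only the Name were exposed, this join would be impossible whenever the Name is not unique or not present in the fact query, which is exactly why exposing the Key matters.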

Surrogate Keys and Other Hidden Attributes

Depending on the cube design, it would be nice to expose the attributes that are hidden in the cube.
For example, in a well-designed cube the model should not expose surrogate keys as visible attributes, because this would create a dependency in reports: a filter based on a surrogate key would break if the relational tables were reprocessed and the surrogate keys regenerated. The general idea is that a surrogate key has no semantic meaning, so it should not be exposed to an end user.
However, if you are importing several tables from a Multidimensional model, trying to create one table per dimension and one table per fact table (maybe changing the cardinality and reducing the number of dimensions used), then you should import the surrogate keys too, at least for SCD type 2 dimensions. Unfortunately, there is no information in the cube model that could help you discriminate between SCD type 1 and SCD type 2, so I’m not sure what the best practice would be in this case. Probably, hidden attributes should be accessible only to advanced users; exposing them by default could be really confusing, and I would avoid that.
This is an area where it’s hard to make the right choice; a compromise is required.
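Why surrogate keys are unavoidable for SCD type 2 can be sketched in Python (hypothetical data; the names `sk` and `business_key` are illustrative): the same business key appears once per historical version, so only the surrogate key identifies the right row.

```python
# Hypothetical SCD type 2 dimension: the same customer (business key C1)
# appears twice, once per historical version, with distinct surrogate keys.
customer_dim = [
    {"sk": 1, "business_key": "C1", "city": "Seattle"},
    {"sk": 2, "business_key": "C1", "city": "Portland"},  # after relocation
]
sales_fact = [
    {"customer_sk": 1, "amount": 10.0},  # sale while the customer was in Seattle
    {"customer_sk": 2, "amount": 20.0},  # sale after the move
]

dim_by_sk = {row["sk"]: row for row in customer_dim}

# Joining on the business key alone would be ambiguous (two versions match);
# the surrogate key resolves each fact to the correct historical version.
cities = [dim_by_sk[f["customer_sk"]]["city"] for f in sales_fact]
print(cities)  # ['Seattle', 'Portland']
```

For an SCD type 1 dimension there is only one version per business key, so the business key would suffice, which is why knowing the SCD type would matter here.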

Related Properties

An attribute can have other properties related to it. By default, all browsable parent attributes can be considered properties of an attribute. However, when we talk about attribute properties, we usually refer to non-browsable attributes. Each non-browsable attribute is just another attribute. Non-browsable attributes are not shown in the list of attributes of a dimension, but they should be available to a user who wants to import data for a certain column. The user interface could be designed in several ways for that:

  • Show an attribute’s properties in a window from which the user can drag and drop them. This might be confusing: the UI should show only the properties that exist for a particular attribute, and each attribute might have different properties, so the user interface might be a concern here.
  • Automatically import all the properties of an attribute (maybe asking the end user for confirmation?) when adding that attribute to a query. The user can then remove the columns that are not required from the query.

Show Formatted Measures

Sometimes it might be useful to import the formatted value of a measure. However, I would not replace the number imported today with the formatted value, because the latter could be a string that does not have any semantic meaning. Providing the option of importing the formatted measure as an additional column in the Power Query window would be very welcome, but don’t just replace one with the other.
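The idea of keeping both columns can be sketched in Python (the format string mimics a cube’s FORMAT_STRING; the column names are hypothetical): the raw number stays usable for calculations, while the formatted string is only for display.

```python
# Keep the raw number and the formatted value as separate columns:
# the formatted string ("$1,234.50") has no semantic meaning for math,
# while the raw value remains usable in further calculations.
rows = [{"sales_amount": 1234.5}, {"sales_amount": 99.9}]

for row in rows:
    # Hypothetical format, mimicking a cube measure's FORMAT_STRING.
    row["sales_amount_formatted"] = "${:,.2f}".format(row["sales_amount"])

print(rows[0])  # {'sales_amount': 1234.5, 'sales_amount_formatted': '$1,234.50'}
```

Replacing the number with the string would make even a simple sum impossible, which is the point of importing the formatted value only as an additional column.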

Invisible attributes

Currently, Power Query shows all the attributes of a dimension, which in general is a good thing. However, I would add a checkbox that shows/hides invisible attributes. By default, I would show only visible attributes, because this is what users are more familiar with. The “show invisible attributes/columns” option should be an advanced view.

Multiple selection without measures

If you select attributes from several dimensions without selecting a measure, you obtain as a result the cross join of the tables you selected. In my opinion, this is counterintuitive and useless: I cannot imagine a use case where it would be meaningful. A much better solution would be to import every dimension as a separate table, just as happens when you select many tables from SQL Server; it is the user who will handle the joins between tables, if necessary. My suggestion is to keep the existing behavior (importing a single table) only when you also import a measure, even if I would like to be able to import all the dimensions and the set of measures as separate tables in the data model, creating one query for each dimension and one query for each measure group (or for each cube; I’m not sure about the better approach here).
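The difference between the two behaviors can be sketched in Python with hypothetical member lists (Python stands in for the MDX that Power Query generates): a cross join multiplies row counts without adding information, while separate tables keep each dimension intact.

```python
from itertools import product

# Two dimensions selected with no measure: the current behavior is a
# cross join, whose row count is the product of the member counts.
products = ["Bike", "Helmet", "Lock"]
regions = ["Europe", "North America"]

crossjoin = list(product(products, regions))
print(len(crossjoin))  # 6 rows, none of which carries any fact data

# Importing each dimension as its own table keeps the data meaningful
# and lets the user model the relationships later, if needed.
tables = {"Product": products, "Region": regions}
print(sum(len(t) for t in tables.values()))  # 5 rows in total
```

With real dimensions of thousands of members each, the cross join explodes combinatorially, which is why importing separate tables is the more useful default.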
 

Getting Started with HDInsight – Part 1 – Introduction to HDInsight

I have read the previous tips on Big Data Basics series of tips and have an overview of Big Data, Hadoop, and the related concepts. My current employer is more of a Microsoft shop and we use Microsoft technologies day-in and day-out. My employer is planning to implement one of our upcoming projects using Hadoop. So I would like to know about Microsoft’s Hadoop offering and get prepared before the start of the project.

Archiving Azure Automation logs to Azure BLOB Storage

For the past few months at work I’ve been diving headlong into a service on Microsoft Azure called Azure Automation which is, to all intents and purposes, a service for hosting and running PowerShell scripts (properly termed “runbooks”). I’ve got a lot to say and a lot to share about Azure Automation, and this is the first such post.

Each time Azure Automation runs a runbook (a running instance of a runbook is termed a job) it stores a log of that job; however, that log isn’t readily available as a log file on a file system as one might be used to. There are two ways to access the log: via the Azure Management Portal at http://manage.windowsazure.com or programmatically using PowerShell cmdlets. Both are handy; however, I wanted the logs available as files in Azure BLOB Storage so that they could be easily viewed and archived (in the future we’ll be building Hive tables over the top of the logs – remind me to blog about that later once it’s done). I have written a runbook called Move-AzureAutomationRunbookLogs that will archive logs for a given runbook and for a given time period into a given Azure BLOB Storage account, and I’ve made it available in the Azure Automation gallery:


https://gallery.technet.microsoft.com/scriptcenter/Archive-Azure-Automation-898a1aa8

It does pretty much what it says on the tin, so if you’re interested hit the link and check it out. For those who are simply interested in reading the code, I’ve provided a screenshot of it below too (suggestions for how to copy and paste PowerShell code into Live Writer so that it keeps its formatting are welcome!). There’s a screenshot of the output too, if that’s the sort of thing that floats your boat.

After this archiving runbook has run you’ll find two folders, “job” and “joboutput”, under the root folder that you specified, and inside each of those is a folder for each runbook whose logs you are collecting (the folder contains a file per job). The “job” folder contains a short summary of each job; the “joboutput” folder contains the good stuff: the actual output from the run.

One last point: if you’re going to use this runbook I highly recommend doing so in conjunction with a technique outlined by Eamon O’Reilly at Monitoring Azure Services and External Systems with Azure Automation. Eamon’s post describes a technique for running operations such as these on a schedule, and it’s something that I’ve found to be very useful indeed.

Hope this is useful!

@Jamiet


SQL Server 2014 Overview on DBTA.com

Haven’t upgraded lately? Well, SQL Server 2014 is a great place to start! Perhaps this is your first time reading my column on DBTA.com, or you don’t know much about SQL Server. If either of those is true, then it might be a surprise to you that Microsoft has accelerated the release cadence for SQL Server to around one new release every 18 to 24 months. Add in the fact that Microsoft goes to market with their beta releases, better known as Community Technology Previews (CTPs), several months before…(read more)

Rett Syndrome Research needs help

Very proud of my daughters this week. As some of you will know, one of my daughters has Rett syndrome. It’s a rare chromosomal disorder that basically only impacts girls. (Boys can get it but they usually do not survive long after birth). Girls develop fairly normally up to about 12 to 18 months of age, then regress markedly. As it affects only about 1 in 10,000 girls, it’s not the sort of thing that gets much research funding.

So my other two daughters have decided to raise some funding to help with a local Rett research program. If anyone is interested, you can make a donation directly here: https://www.mycause.com.au/…/fundraiserforrettsyndromeresea…

Alternately, my youngest daughter is organising a staged musical for early next year (in Brisbane) with all profits again going directly to the Rett research team. I’ll post more about it closer to when tickets for the musical are available.

Perth SQL Server User Group this Thursday 5PM

Looking forward to seeing the Perth locals at the SQL Server user group this Thursday night at 5PM.

The topic that the group has chosen is an Introduction to SQL Server High Availability Options.

It’s on at the BHP offices at 125 St Georges Terrace. If you are coming, please ping Jenny to let her know.

I look forward to seeing many of you there.

Find SQL Server Agent Jobs That Exist on One Instance and Not on Another Instance

We have multiple SQL Servers across development, QA and production and we need to quickly determine what SQL Server Agent jobs exist in one environment, but not the other. This will help us determine if these jobs are necessary, currently in development, or being tested in development before landing in production. In this tip we look at how to use PowerShell to compare SQL Agent jobs across different SQL Server instances to find jobs that exist on one server and not another.
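The tip itself uses PowerShell and SMO to query each instance; the core comparison it performs is a set difference over job names, which can be sketched in Python with hypothetical job lists.

```python
# Hypothetical job name lists as they might be pulled from two instances
# (the tip retrieves these with PowerShell; only the comparison logic
# is sketched here).
dev_jobs = {"NightlyETL", "IndexMaintenance", "ExperimentalLoad"}
prod_jobs = {"NightlyETL", "IndexMaintenance", "BackupCleanup"}

# Jobs present on one instance but missing from the other, in each direction.
only_in_dev = sorted(dev_jobs - prod_jobs)
only_in_prod = sorted(prod_jobs - dev_jobs)

print(only_in_dev)   # ['ExperimentalLoad']
print(only_in_prod)  # ['BackupCleanup']
```

Comparing in both directions matters: a job missing from production may still be in development, while a job missing from development may be an orphan worth reviewing.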

What are useful tools and resources for DAX developers? #dax #powerpivot #tabular

At the last PASS Summit I received an interesting question: is there a list of all the useful (I would say necessary…) tools for DAX developers? My answer was… “no, but this is an interesting topic for a blog post”.

In the meantime, I thought that a page I keep updated would be better, and of course an easy-to-remember URL is equally important. So here is the URL where you will find an updated list of tools and resources useful to any DAX developer:

http://sql.bi/daxtools

Of course, feedback is welcome!