Category Archives: Power BI

The Power BI dashboard in public preview – available also without Office #powerbi

Microsoft has released a new version of Power BI in preview mode, including many new visualizations that are immediately available in production to all existing subscribers, such as the long-awaited treemap, combo charts (combining a line chart and a column chart), and more. These features are available only in HTML5 visualizations, so you can only use them online. Microsoft has shown these visualizations several times this year (at the PASS BA Conference in San Jose and the PASS Summit in Seattle), so they are now finally available to anyone. But there is much more!

Power BI Dashboard is a new service, now in public preview (unfortunately only in the United States; I am not sure which other countries are supported by now, but certainly not Europe), that does not require an Office 365 subscription and, more importantly, provides a design experience on the desktop even without Excel or Office at all. In other words, there is a separate Microsoft Power BI Designer that enables you to:

  • Import data with Power Query
  • Create relationships between tables
  • Create data visualizations with Power View (running the latest HTML5 version locally in a desktop application)

This very first release does not include the full data modeling experience we are used to in Power Pivot, so you cannot create calculated columns or measures, but hopefully this will come in the next updates. In this way, you can use Power BI with a separate “data model” environment that is not tied to Excel. You can have an older version of Excel, or no Excel at all, and still design your data model with the Designer.

For now, the goal of this app is simply to offer an offline design experience, and I have to say that the performance of data visualization is very good. With the Designer you design data models and reports. Once they are published to the Power BI web site, you can “consume” the data, but you can also modify the report and “pin” objects to a dashboard, so that you can build your own custom dashboard, such as the Retail Analysis Sample you can see below.

Figure: Retail Analysis Sample dashboard

You can create datasets getting data from several SaaS applications, such as Dynamics CRM, Salesforce, GitHub, ZenDesk, SendGrid, and Marketo. You can also connect live to Analysis Services through a new gateway named Power BI Analysis Services Connector and use the new native mobile apps for Power BI. Support for iPad should already be available (again, depending on the country; it does not seem to be available in Europe yet). Future support for iPhone and Windows tablets has already been announced.

This is a very interesting evolution of the Power BI platform and I look forward to using it with real data and real users! Many tutorial videos are available on YouTube.

Relational Data Lake

What is a Data Lake?
Pentaho CTO James Dixon is credited with coining the term “Data Lake”. As he describes it in his blog entry, “If you think of a Data Mart as a store of bottled water – cleansed and packaged and structured for easy consumption – the Data Lake is a large body of water in a more natural state. The contents of the Data Lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”

These days, demands for BI data stores are changing. BI data consumers not only require cleansed and nicely modeled data, updated on a daily basis, but also raw, uncleansed and unmodeled data which is available near real-time. With new and much more powerful tooling like Power BI, users can shape and cleanse data in a way that fits their personal needs without the help of the IT department. This calls for a different approach when it comes to offering data to these users.

BI data consumers also demand a very short time-to-market for new data: they don’t want to wait a few months until data is made available by a BI team, they want it today. The raw, uncleansed form of data in a Data Lake can be loaded very quickly because it is suitable for generated data loading technologies and replication, which makes this short time-to-market possible. Once users have discovered the data and have acquired enough insights that they want to share with the entire organization in a conformed way, the data can be brought to traditional Data Warehouses and cubes in a predictable manner.

Furthermore, there is a rise in the presence of unstructured and semi-structured data, and in the need to have “big data” available for ad hoc analyses. Storing and analyzing these forms of data requires new technologies and data structures.

When the Data Lake is in place, a lot of data streams from sources into the “lake” without knowing up front whether it will be eligible for answering business questions. The data cannot be modeled yet, because it is not clear how it will be used later on. Data consumers get the possibility to discover data and find answers before the questions are even defined. This differs fundamentally from the concept of a Data Warehouse, in which the data is delivered through predefined data structures, based on relevant business cases and questions.

Technology
From a technology point of view, a Data Lake is a repository which offers storage for large quantities and varieties of unstructured, semi-structured and structured data derived from all possible sources. It can be formed by multiple underlying databases which store these differently structured forms of data in both SQL and NoSQL technologies.

For the semi-structured/unstructured side of data which is used for big data analytics, Data Lakes based on Hadoop and other NoSQL technologies are common. For the semi-structured/structured data, SQL technologies are the way to go.

In this blog post I will describe the semi-structured/structured, relational appearance of the Data Lake in the form of a SQL Server database: The Relational Data Lake.

Extract Load (Transform)
Data in a Data Lake is in raw form. Transformations are not performed during loading, and relationships and constraints between tables are not created; this is the default behavior for transactional replication and keeps the loading process as lean and fast as possible. Because of the lack of transformations, movement of the data follows the Extract-Load-(Transform) (EL(T)) pattern instead of the traditional E-T-L. This pattern makes loading data into the Data Lake easier, faster and much more suitable for replication technologies or generated SSIS processes, for example with BIML. This creates a very attractive time-to-market for data which is added to the Data Lake. Latency of data is kept as low as possible; preferably data is loaded in near real-time: data should stream into the lake continuously.

Transformations take place after the data is loaded into the Data Lake, where applicable. Cosmetic transformations, like translations from technical object and column names to meaningful descriptions which end users understand, or other lightweight transformations, can be performed in new structures (like SQL views) that are created inside the Data Lake.

Unlike Data Marts and Data Warehouses, which are optimized for data analysis by storing only the required attributes and sometimes dropping data below the required level of aggregation, a Data Lake always retains all attributes and (if possible) all records. This way it is future-proof for solutions that will require this data at a later moment in time or for users who will discover the data.

Accessing data
Data is made accessible through structures which can either be accessed directly, or indirectly through exposure as OData feeds. These structures are secured and are the only objects end users or other processes have access to. The feeds can be accessed with any tool or technology that is best suited to the task at any moment in time, for example Power BI tooling like Excel Power Pivot/Power Query.

We normally create SQL views in which security rules and the required transformations are applied, as sketched below.
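
As a sketch of such a view (all schema, table, and column names are hypothetical, not taken from a real implementation), cosmetic renaming and a simple row-level security rule can be combined in a single object:

-- Hypothetical sketch of a Data Lake view: technical names are translated to
-- meaningful descriptions and a row-level security rule is applied.
CREATE VIEW [lake].[SalesOrders]
AS
SELECT
	src.ORDNO   AS [Order Number],   -- cosmetic rename of technical columns
	src.ORDDT   AS [Order Date],
	src.CUSTID  AS [Customer Id],
	src.AMT     AS [Order Amount]
FROM [raw].[T_ORD_HDR] AS src
WHERE src.REGION_CD IN
(
	-- only return rows for regions the current login is allowed to see
	SELECT ur.REGION_CD
	FROM [security].[UserRegions] AS ur
	WHERE ur.LoginName = SUSER_SNAME()
);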

The Data Lake also acts as a hub for other repositories and solutions like Data Warehouses and Operational Cubes.

Master Data
The success of the Data Lake depends on good master data. When end users discover new raw data from the Data Lake, they need to be able to combine it with high quality master data to get proper insights. Therefore a master data hub is a must-have when a Data Lake is created. This hub can simply be a database with master data structures in it; master data management on this data is preferable but not required. The master data hub should be a standalone solution, independent from the other BI solutions, as master data isn’t part of these solutions but is only used as a data source. It should be sourced independently too, preferably using master data tooling or tools like SSIS. Just like data from the Data Lake, master data should only be accessed through structures which can also be exposed as OData feeds.

Next to the purpose of combining master data with data from the Data Lake, the master data can be used as a source for other BI solutions like Data Warehouses. There, the master data structures are often used as Data Warehouse dimensions. To prevent unnecessary duplicate loading into the Data Warehouse of master data that already exists in the master data hub, it can be a good choice to leave the master data out of the Data Warehouse dimensions. Only the business keys are stored, which can be used to retrieve the data from the master data hub when required. This way the Data Warehouse remains slim and fast to load, and master data is stored in a single, centralized data store.
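
A minimal sketch of how this works at query time (all table and column names are hypothetical): the Data Warehouse dimension carries only the business key, and the descriptive attributes are joined in from the master data hub.

-- Hypothetical sketch: the slim dimension stores only the business key;
-- descriptive attributes are retrieved from the master data hub when queried.
SELECT
	f.SalesAmount,
	d.CustomerBusinessKey,
	m.CustomerName,           -- descriptive attribute kept in the master data hub
	m.CustomerSegment
FROM dw.FactSales   AS f
JOIN dw.DimCustomer AS d ON d.CustomerKey         = f.CustomerKey
JOIN mdh.Customer   AS m ON m.CustomerBusinessKey = d.CustomerBusinessKey;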

Architecture
The entire Data Lake architecture, with all the described components, fits in the model below. From bottom to top the highlights are:

  • Extract/Load data from the sources to the Data Lake, preferably in near real-time.
  • The Data Lake can consist of multiple SQL (and NoSQL) databases.
  • Transformations and authorizations are handled in views.
  • The Data Lake acts as hub for other BI solutions like Data Warehouses and Cubes.
  • The master data hub is in the center of the model and in the center of the entire architecture. It’s loaded as a standalone solution and isn’t part of any of the other BI solutions.
  • Traditional BI will continue to exist and continue to be just as important as it has always been. It will be sourced from the Data Warehouses and cubes (and master data hub).
  • The Discovery Platform with its new Power BI tooling is the place where “various users of the lake can come to examine, dive in, or take samples.” These samples can be combined with the data from the master data hub.

Figure: Data Lake BI architecture

Data Lake Challenges
Setting up a Data Lake comes with many challenges, especially regarding data governance. For example, it is easy to create any view in the Data Lake and lose control over who gets access to what data. From a business perspective it can be very difficult to deliver the master data structures that are so important for the success of the Data Lake. And from a user perspective, wrong conclusions can be drawn by users who get insights from the raw data; therefore the Data Warehouse should still be offered as a clean, trusted data structure for decision makers and as a data source for conformed reports and dashboards.

Summary
The Data Lake can be a very valuable data store that complements the traditional Data Warehouses and cubes, which will stay as important as they are now for many years to come. But considering the increasing amount and variety of data, the more powerful self-service ETL and data modeling tooling that keeps appearing, and the shortened time-to-market required for near real-time data from source to user, the Data Lake offers a future-proof data store and hub that enables answering yet undefined questions and gives users personal data discovery and shaping possibilities.

Thanks go to my Macaw colleague Martijn Muilwijk for brainstorming on this subject and reviewing this blog post.

Relational Data Warehouse + Big Data Analytics: Analytics Platform System (APS) Appliance Update 3

This blog post was authored by: Matt Usher, Senior PM on the Microsoft Analytics Platform System (APS) team

Microsoft is happy to announce the release of the Analytics Platform System (APS) Appliance Update (AU) 3. APS is Microsoft’s big data in a box appliance for serving the needs of relational data warehouses at massive scale. With this release, the APS appliance supports new scenarios for utilizing Power BI modeling, visualization, and collaboration tools over on-premises data sets. In addition, this release extends PolyBase to allow customers to utilize the HDFS infrastructure in Hadoop for ORC files and directory modeling to more easily integrate non-relational data into their data insights.

The AU3 release includes:

  • PolyBase recursive Directory Traversal and ORC file format support
  • Integrated Data Management Gateway enables queries from Power BI to on-premises APS
  • TSQL compatibility improvements to reduce migration friction from SQL Server SMP
  • Replatformed to Windows Server 2012 R2 and SQL Server 2014

PolyBase Directory Traversal and ORC File Support

PolyBase is an integrated technology that allows customers to utilize the skillset that they have developed in TSQL for querying and managing data in Hadoop platforms. With the AU3 release, the APS team has augmented this technology with the ability to define an external table that targets a directory structure as a whole. This new ability unlocks a whole new set of scenarios for customers to utilize their existing investments in Hadoop as well as APS to provide greater insight into all of the data collected within their data systems. In addition, AU3 introduces full support for the Optimized Row Column (ORC) file format – a common storage mechanism for files within Hadoop.
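
As a minimal sketch of what the ORC support looks like in practice (the format name below is illustrative), an external file format can be declared for ORC data and then referenced from an external table definition like the one shown further below:

-- Sketch: declare an external file format for ORC files; the name is illustrative.
CREATE EXTERNAL FILE FORMAT OrcLogFormat
WITH
(
	FORMAT_TYPE = ORC
);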

As an example of this new capability, let’s examine a customer that is using APS to host inventory and Point of Sale (POS) data in an APS appliance while storing the web logs from their ecommerce site in a Hadoop path structure. With AU3, the customer can simply maintain their logs in Hadoop in a structure that is easy to construct, such as year/month/date/server/log, for simple storage and recovery within Hadoop, and then expose that structure as a single table to analysts and data scientists for insights.

In this example, let’s assume that each of the Serverxx folders contains the log file for that server on that particular day. In order to surface the entire structure, we can construct an external table using the following definition:

CREATE EXTERNAL TABLE [dbo].[WebLogs]
(
	[Date] DATETIME NULL,
	[Uri] NVARCHAR(256) NULL,
	[Server] NVARCHAR(256) NULL,
	[Referrer] NVARCHAR(256) NULL
)
WITH
(
	LOCATION='//Logs/',
	DATA_SOURCE = Azure_DS,
	FILE_FORMAT = LogFileFormat,
	REJECT_TYPE = VALUE,
	REJECT_VALUE = 100
);

By setting the LOCATION to the //Logs/ folder, the external table will pull data from all folders and files within the directory structure. In this case, a simple select of the data will return the first five entries ordered by date, regardless of the log file that contains the data:

SELECT TOP 5
	*
FROM
	[dbo].[WebLogs]
ORDER BY
	[Date]

The query returns a single result set drawn from all of the log files in the directory structure.

Note: PolyBase, like Hadoop, will not return results from hidden folders or from any file that begins with an underscore (_) or a period (.).

Integrated Data Management Gateway

With the integration of the Microsoft Data Management Gateway into APS, customers now have a scale-out compute gateway for Azure cloud services to more effectively query sophisticated sets of on-premises data.  Power BI users can leverage PolyBase in APS to perform more complicated mash-ups of results from on-premises unstructured data sets in Hadoop distributions. By exposing the data from the APS Appliance as an OData feed, Power BI is able to easily and quickly consume the data for display to end users.

For more details, please look for an upcoming blog post on the Integrated Data Management Gateway.

TSQL Compatibility improvements

The AU3 release incorporates a set of TSQL improvements targeted at richer language support to improve the types of queries and procedures that can be written for APS. For AU3, the primary focus was on implementing full error handling within TSQL, to allow customers to port existing applications to APS with minimal code change and to bring full error handling to existing APS customers. Released in AU3 are the standard TSQL keywords and constructs for handling errors.

In addition to the error handling components, the AU3 release also includes support for the XACT_STATE scalar function that is used to indicate the current running transaction state of a user request.
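
As a hedged illustration of the pattern (the table and values are hypothetical, and the exact set of supported constructs should be checked against the AU3 documentation), standard TSQL error handling combined with XACT_STATE looks like this:

-- Sketch: TRY/CATCH error handling with XACT_STATE; object names are illustrative.
BEGIN TRY
	BEGIN TRANSACTION;
	UPDATE dbo.Inventory
	SET QuantityOnHand = QuantityOnHand - 10
	WHERE ProductKey = 42;
	COMMIT TRANSACTION;
END TRY
BEGIN CATCH
	-- XACT_STATE() <> 0 means a transaction is still open and must be rolled back
	IF XACT_STATE() <> 0
		ROLLBACK TRANSACTION;
	THROW;  -- re-raise the original error to the caller
END CATCH;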

Replatformed to Windows Server 2012 R2 and SQL Server 2014

The AU3 release also marks the upgrade of the core fabric of the APS appliance to Windows Server 2012 R2 and SQL Server 2014. With the upgrade to the latest versions of Microsoft’s flagship server operating system and core relational database engine, the APS appliance takes advantage of the improved networking, storage, and query execution components of these products. For example, the APS appliance now utilizes a virtualized Active Directory infrastructure, which helps to reduce cost and increase domain reliability within the appliance, helping to make APS the price/performance leader in the big data appliance space.

APS on the Web

To learn more about the Microsoft Analytics Platform System, please visit us on the web at http://www.microsoft.com/aps

BI Announcements at PASS Summit 2014 #sqlpass #powerbi #powerpivot

This morning the PASS Summit 2014 started in Seattle, and during the keynote there were several announcements from Microsoft. I’m considering here only the ones about Business Intelligence (you will find other blogs around about SQL Server).

  • In the coming months, Azure SQL Database will get new features such as columnstore indexes, which can be very interesting for creating data marts in the cloud
  • Another upcoming feature in SQL Server will be an updatable columnstore index on in-memory tables, a feature well suited to real-time analytics.
  • For store analysis, an interesting demo used Kinect to capture a heatmap showing which areas of a shop have been visited most, displayed with Power Map. Just a demo, but it’s an interesting idea and the best big data demo I’ve seen so far (something you can implement in the real world using big data technologies without being Twitter or Facebook).
  • New Power BI dashboards: many new visualizations and a new user interface to place data visualizations on a dashboard (similar to the grid you have in DataZen if you know that product)
    • You can connect to your data source from the cloud, without creating a local data model and sending it to the cloud
    • Q&A is integrated in the new user interface – the web site is on a powerbi.com domain; it does not seem to be in SharePoint
    • Q&A generates the report in HTML5, no Silverlight signs here
    • The entire editing is done in a web browser – a preview of that was presented at PASS BA Analytics keynote, this seems a more refined version (still not available, however)
    • TreeMap is available as a new visualization
    • You can upload an Excel file from your disk or from OneDrive – just an Excel file, no Power Pivot data model required (is it created on the fly in the cloud?)
    • Combo chart combining line and bar chart visualization available
    • Private preview now, public preview available soon
    • Request access to public preview on http://solutions.powerbi.com
  • Azure ML is publicly available for free in trial mode

The Power BI story seems the real big news. Combining this with the fact that you can query *existing* on-prem databases on Analysis Services without moving them to the cloud opens up interesting scenarios. Many questions remain about when it will be available and how it will be deployed. Interesting times ahead.

Power Query support for Analysis Services (MDX)

Today at TechEd Europe 2014, Miguel Llopis gave the first public demonstration of Power Query support for Analysis Services.

This is still not available, but it should be released soon (hopefully it will be our Christmas gift!).

Here is a list of features shown:

  • It should be able to query both Multidimensional and Tabular
  • Generates query in MDX (no DAX by now)
  • Load one table at a time (but a query can mix dimensions and measures)
  • Shows dimensions, measures, hierarchies and attributes in Navigator
  • Use the typical Power Query transformations working on a “table” result
  • You import one table at a time

I think the last point deserves an explanation. When you write a query in Power Query, the result is a single table. If I want to build a Power Pivot data model getting data from an existing cube in Analysis Services, but with a different granularity, I have to run one query for each dimension and one query for the fact table. Depending on the definition of the cube, this could be easier or harder, because the original columns could have been hidden and measures exposed instead. Moreover, the result of a measure that is not aggregated with a sum (imagine just an average) could be impossible to aggregate correctly in Power Pivot.

Thus, if you want your users to take advantage of Power Query, make sure you expose in the model measures that can be re-aggregated to compute non-additive calculations (such as an average!).
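
To make the point concrete in relational terms (the table and column names are hypothetical, and this is a sketch rather than a cube definition): expose the additive building blocks, sum and row count, so that the average can be recomputed correctly at any granularity chosen later.

-- Sketch: additive components (sum and count) let the non-additive average
-- be recomputed at whatever granularity the user picks downstream.
SELECT
	ProductCategory,
	SUM(SalesAmount)                AS TotalSales,  -- additive
	COUNT_BIG(*)                    AS SalesCount,  -- additive
	SUM(SalesAmount) / COUNT_BIG(*) AS AvgSales     -- non-additive, derived
FROM dbo.FactSales
GROUP BY ProductCategory;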

Now I look forward to receiving this Christmas gift!

Announcing the PASS 24 HOP Challenge

Calling all data junkies! How smart are you?  Want to get smarter?

Play along with the #pass24hop Challenge on Twitter starting at 5:00 AM PT Tuesday, September 9, 2014 to win a free Microsoft Exam Voucher! Simply watch 24 Hours of PASS and be the first to answer the question correctly. At the beginning of each of the 24 live 24 Hours of PASS sessions (approximately 5-8 minutes into each hour), a new question regarding the session will be posted on the @SQLServer Twitter account. The first tweet with the correct answer will win a prize. Your answer must include the hashtags #pass24hop and #24hopquiz.

To take part in the #pass24hop Challenge, you must:

  1. Sign in to your Twitter account. If you do not have an account, visit www.twitter.com to create one. Twitter accounts are free.
  2. Once logged into your Twitter account, follow the links and instructions to become a follower of @SQLServer.
  3. From your own account, reply with your response to the question tweeted by @SQLServer.
  4. Your tweet must contain both the #pass24hop and #24hopquiz hashtags to be eligible for entry.
  5. Your tweet must include the complete answer to the question, or it will be disqualified.
  6. The first person to tweet a correct reply to the corresponding question will win the prize described below.

Register now for 24 Hours of PASS and get ready for 24 hours of play!  

To learn more about 24 Hours of PASS, read the official rules below.

 

NO PURCHASE NECESSARY. COMMON TERMS USED IN THESE RULES:

These are the official rules that govern how the ’24 Hours of PASS Social Media Answer & Question Challenge (“Sweepstakes”) promotion will operate. This promotion will be simply referred to as the “Sweepstakes” throughout the rest of these rules. In these rules, “we,” “our,” and “us” refer to Microsoft Corporation, the sponsor of the Sweepstakes. “You” refers to an eligible Sweepstakes entrant.

WHAT ARE THE START AND END DATES?

This Sweepstakes starts at 5:00 AM PT Tuesday, September 9, 2014 and ends at 7:00 AM PT Wednesday, September 10, 2014 (“Entry Period”). The Sweepstakes consists of 24 prizes. Each Prize Period will begin immediately following each of the 24 sessions and run for 60 minutes.

CAN I ENTER?

You are eligible to enter this Sweepstakes if you meet the following requirements at time of entry:

· You are a professional or enthusiast with expertise in SQL Server or Business Intelligence and are 18 years of age or older; and

o If you are 18 years of age or older, but are considered a minor in your place of residence, you should ask your parent’s or legal guardian’s permission prior to submitting an entry into this Sweepstakes; and

· You are NOT a resident of any of the following countries: Cuba, Iran, North Korea, Sudan, and Syria.

PLEASE NOTE: U.S. export regulations prohibit the export of goods and services to Cuba, Iran, North Korea, Sudan and Syria. Therefore residents of these countries / regions are not eligible to participate.

• You are NOT an employee of Microsoft Corporation or an employee of a Microsoft subsidiary; and

• You are NOT involved in any part of the administration and execution of this Sweepstakes; and

• You are NOT an immediate family (parent, sibling, spouse, child) or household member of a Microsoft employee, an employee of a Microsoft subsidiary, or a person involved in any part of the administration and execution of this Sweepstakes.

This Sweepstakes is void wherever prohibited by law.

HOW DO I ENTER?  

At the beginning of each of the 24 live 24 Hours of PASS sessions (approximately 5-8 minutes into each hour), a new question regarding the session will be posted on the @SQLServer Twitter account. The first tweet with the correct answer will win a prize. Your answer must include the hashtags #pass24hop and #24hopquiz. Failure to use these hashtags will automatically disqualify you.

To enter, you must do all of the following:

  1. Sign in to your Twitter account. If you do not have an account, visit www.twitter.com to create one. Twitter accounts are free.
  2. Once logged into your Twitter account, follow the links and instructions to become a follower of @SQLServer
  3. From your own account, reply with your response to the question tweeted by @SQLServer.
  4. Your tweet must contain both the #pass24hop and #24hopquiz hashtags to be eligible for entry
  5. Your tweet must include the complete answer to the question, or it will be disqualified.
  6. The first person to tweet a correct reply to the corresponding question will win the prize described below.

Limit one entry per person, per session. For the purposes of these Official Rules, a “day” begins at 5:00 AM PT Tuesday, September 9, 2014 and ends at 7:00 AM PT Wednesday, September 10, 2014 (“Entry Period”). If you reply with more than one answer per session, all replies received from you for that session will be automatically disqualified. You may submit one answer to each session, but will be eligible to win only one prize within the 24-hour contest period.

We are not responsible for entries that we do not receive for any reason, or for entries that we receive but are not decipherable for any reason, or for entries that do not include your Twitter handle.

We will automatically disqualify:

  • Any incomplete or illegible entry; and
  • Any entries that we receive from you that do not meet the requirements described above.

WINNER SELECTION AND PRIZES

The first person to respond correctly will receive a Microsoft Exam Voucher. Approximate Retail Value: $150 each. A total of twenty-four prizes are available.

Within 48 hours following the Entry Period, we, or a company acting under our authorization, will select one winner per session to win one free Microsoft Certification Exam. Each voucher has a retail value of $150. Prize eligibility is limited to one prize within the contest period. If you are selected as a winner for a session, you will be ineligible for additional prizes for any other session. In the event that you are the first to answer correctly in multiple sessions, the prize will go to the next person with the correct answer.

If there is a dispute as to who is the potential winner, we reserve the right to make final decisions on who is the winner based on the accuracy of the answer provided, on whether the rules about including hashtags were followed, and on the times the answers arrive as listed on www.twitter.com.

Selected winners will be notified via a Direct Message (DM) on Twitter within 48 business hours of the daily drawing. The winner must reply to our Direct Message (DM) within 48 hours of notification via DM on Twitter. If the notification that we send is returned as undeliverable, or you are otherwise unreachable for any reason, or you do not respond within 48 business hours, we will award the prize to a randomly selected alternate winner. Only one alternate winner will be selected and notified; after that, if unclaimed, the prize will remain unclaimed.

If you are a potential winner, we may require you to sign an Affidavit of Eligibility, Liability/Publicity Release within 10 days of notification. If you are a potential winner and you are 18 or older, but are considered a minor in your place of legal residence, we may require your parent or legal guardian to sign all required forms on your behalf. If you do not complete the required forms as instructed and/or return the required forms within the time period listed on the winner notification message, we may disqualify you and select an alternate winner.

If you are confirmed as a winner of this Sweepstakes:

  • You may not exchange your prize for cash or any other merchandise or services. However, if for any reason an advertised prize is unavailable, we reserve the right to substitute a prize of equal or greater value; and
  • You may not designate someone else as the winner. If you are unable or unwilling to accept your prize, we will award it to an alternate potential winner; and
  • If you accept a prize, you will be solely responsible for all applicable taxes related to accepting the prize; and
  • If you are otherwise eligible for this Sweepstakes, but are considered a minor in your place of residence, we may award the prize to your parent/legal guardian on your behalf.

WHAT ARE YOUR ODDS OF WINNING? 
There will be 24 opportunities to respond with the correct answer. Your odds of winning this Challenge depend on the number of responses and being the first to answer with the correct answer.

WHAT OTHER CONDITIONS ARE YOU AGREEING TO BY ENTERING THIS CHALLENGE? 
By entering this Challenge you agree:

· To abide by these Official Rules; and

· To release and hold harmless Microsoft, and its respective parents, subsidiaries, affiliates, employees and agents from any and all liability or any injury, loss or damage of any kind arising from or in connection with this Challenge or any prize won; and

· That Microsoft’s decisions will be final and binding on all matters related to this Challenge; and

· That by accepting a prize, you agree that Microsoft may use your proper name and state of residence online and in print, or in any other media, in connection with this Challenge, without payment or compensation to you, except where prohibited by law.

WHAT LAWS GOVERN THE WAY THIS CHALLENGE IS EXECUTED AND ADMINISTRATED? 
This Challenge will be governed by the laws of the State of Washington, and you consent to the exclusive jurisdiction and venue of the courts of the State of Washington for any disputes arising out of this Challenge.

WHAT IF SOMETHING UNEXPECTED HAPPENS AND THE CHALLENGE CAN’T RUN AS PLANNED? 
If cheating, a virus, bug, catastrophic event, or any other unforeseen or unexpected event that cannot be reasonably anticipated or controlled (also referred to as force majeure) affects the fairness and/or integrity of this Challenge, we reserve the right to cancel, change or suspend this Challenge. This right is reserved whether the event is due to human or technical error. If a solution cannot be found to restore the integrity of the Challenge, we reserve the right to select winners from among all eligible entries received before we had to cancel, change or suspend the Challenge. If you attempt to compromise the integrity or the legitimate operation of this Challenge by hacking or by cheating or committing fraud in ANY way, we may seek damages from you to the fullest extent permitted by law. Further, we may ban you from participating in any of our future Challenges, so please play fairly.

HOW CAN YOU FIND OUT WHO WON? 
To find out who won, send an email to v-daconn@microsoft.com by September 15, 2014 with the subject line: “SQL Server QQ Winners”.

WHO IS SPONSORING THIS CHALLENGE? 
Microsoft Corporation 
One Microsoft Way 
Redmond, WA 98052

Microsoft named a Leader in Agile Business Intelligence by Forrester

We are pleased to see Microsoft acknowledged by Forrester Research as a Leader in The Forrester Wave™: Agile Business Intelligence Platforms, Q3 2014.  

We are happy to see what we believe to be an affirmation of our approach and of the strength of our technologies. Our placement in this report reflects both high scores from our clients for product vision and the client feedback collected as part of the customer survey. Forrester notes that “Microsoft received high client feedback scores for its agile, business user self-service and [advanced data visualization] ADV functionality. Clients also gave Microsoft BI a high score for its product vision”. This feedback from our customers is especially gratifying to see.

Microsoft is delivering on our vision of making business intelligence more agile and accessible through the tools that people use every day. With the accessibility of Excel and the recent release of Power BI for Office 365, we aim to lower the barrier of entry for users and reduce the complexity of deploying business intelligence solutions for IT. Using Microsoft’s business intelligence solution, companies such as MediaCom have reduced time to reporting from weeks to days, Carnegie Mellon is using data to reduce energy consumption by 30%, and Helse Vest is combining hospital data to visualize trends in real time.

We appreciate the recognition of our software in this report. Above all, we value our customers’ voices in helping shape and validate this approach.

Microsoft adds forecasting capabilities to Power BI for O365

The PASS Business Analytics Conference — the event where big data meets business analytics – kicked off today in San Jose. Microsoft Technical Fellow Amir Netz and Microsoft Partner Director Kamal Hathi delivered the opening keynote, where they highlighted our customer momentum, showcased business analytics capabilities including a new feature update to Power BI for Office 365 and spoke more broadly on what it takes to build a data culture.

To realize the greatest value from their data, businesses need familiar tools that empower all their employees to make decisions informed by data. By delivering powerful analytics capabilities in Excel and deploying business intelligence solutions in the cloud through Office 365, we are reducing the barriers for companies to analyze, share and gain insight from data. Our customers have been responding to this approach through rapid adoption of our business analytics solutions — millions of users are utilizing our BI capabilities in Excel and thousands of companies have activated Power BI for Office 365 tenants.

One example of how our customers are using our business analytics tools is MediaCom, a global advertising agency which is using our technology to optimize performance and “spend” across their media campaigns utilizing data from third party vendors. With Power BI for Office 365, the company now has a unified dashboard for real-time data analysis, can share reports, and can ask natural-language questions that instantly return answers in the form of charts and graphs. MediaCom now anticipates analyses in days versus weeks and productivity gains that can add millions of dollars in value per campaign.

One of the reasons we’re experiencing strong customer adoption is because of our increased pace of delivery and regular service updates. Earlier this week we released updates for the Power Query add-in for Excel and today we are announcing the availability of forecasting capabilities in Power BI for Office 365. With forecasting users can predict their data series forward in interactive charts and reports. With these new Power BI capabilities, users can explore the forecasted results, adjust for seasonality and outliers, view result ranges at different confidence levels, and hindcast to view how the model would have predicted recent results.  

In the keynote we also discussed how we will continue to innovate to enable better user experiences through touch-optimized capabilities for data exploration. We are also working with our customers to make their existing on-premises investments “cloud-ready”, including the ability for customers to run their SQL Server Reporting Services and SQL Server Analysis Services reports and cubes in the cloud against on-premises data. For cross-platform mobile access across all devices we will add new features to make HTML5 the default experience for Power View.

To learn more about the new forecasting capabilities in Power BI for O365, go here. If you’re attending the PASS Business Analytics Conference this week, be sure to stop by the Microsoft booth to see our impressive Power BI demos and attend some of the exciting sessions we’re presenting at the event. 

First steps with Scheduled Data Refresh for Power Query #powerbi #powerquery

Just a few days before my session about Power Query at TechEd 2014, Microsoft released a new update that enables the scheduled data refresh of a Power Pivot workbook containing Power Query transformations.

This is very good news, because it enables the data refresh of a number of different data sources. Even if the number of providers supported by this release is limited (only SQL Server and Oracle), you can use a SQL Server database as a bridge to access different data sources through views based on linked server connections.

If you want to use this feature, first of all read carefully the Scheduled Data Refresh for Power Query blog post on the MSDN web site. It guides you through the steps required to enable the data source connection through the Data Management Gateway. As you will see, you need to create the data source connections corresponding to the databases you use in Power Query. Thus, you might skip the data source configuration if you already have the corresponding databases enabled in the Power BI admin center. However, I suggest you go through the steps described in that blog post at the beginning, because if the same database is accessed with two different drivers, it needs two different data sources. For this reason, I have a number of notes that might be helpful to avoid certain issues.

  • Power Query uses the .NET Framework Data Provider for SQL Server and Oracle Data Provider for .NET, whereas Power Pivot by default creates a SQL Server connection using the SQL Server Native Client 11.0 (SQLNCLI11).
    • Even if you already created a data source for a SQL Server database you refresh in a Power Pivot workbook, you have to create another data source for the same SQL Server database for Power Query, because you use two different drivers.
    • You might consolidate these data sources into only one, by changing the data provider in the advanced options of a Power Pivot configuration, but I am not sure this is a good idea. I would keep the two versions of the data sources, one for each provider, in case I use the same database in both connections.
  • Power Query creates one connection string in Excel for each query you create. The connection string contains the entire transformation, and when you copy it into the New Data Source page in the Power BI admin center, the internal query is analyzed to extract the required connections to SQL Server. If these connections are already configured as Power BI data sources, then you don’t need to do anything else. I suggest you iterate over all the queries you have following this step until you are confident of the internals and you are sure the required data sources are already available.
    • Even if you create a single query in M language accessing different databases, the referenced connections will be found and each database will have a separate data source configuration in Power BI. I was worried that loading multiple tables from different databases on the same server would have produced a single data source allowing access to all the databases on the server, but luckily this does not happen and security is preserved!
  • I spotted an issue using certain DateTimeZone functions (DateTimeZone.FixedLocalNow, DateTimeZone.FixedUtcNow, DateTimeZone.LocalNow, and DateTimeZone.UtcNow) that do not seem to work with scheduled data refresh. You can read more about this issue in this thread on the Power Query MSDN forum. I found a workaround using the Table.Buffer function, so that by stopping query folding the expression is not translated into SQL but evaluated directly by the Power Query engine. However, I hope this will be fixed soon.
  • A Power Query transformation that contains only a script, without accessing any data source, currently is not refreshed. This would be useful for generating a Date table; I opened this other thread about this issue on the forum, and I hope there will be news on that, too.
    • In the same thread you will find another tip: literals in the form #literal, such as #table, are mis-analyzed by scheduled refresh, but at least for this issue there are workarounds available until the issue is fixed by Microsoft.
  • You can use SQL Server views based on linked servers to overcome the limitation of providers currently supported by the Data Management Gateway (which is the component used by scheduled data refresh); see the sketch after this list.
  • Now that it is possible to publish SSIS packages as OData Feed Sources, you can expose a SQL Server view to Power BI and, by accessing it from Power Pivot or Power Query, execute SSIS packages at refresh time. If the package does not take too long to execute (it would time out the connection), this is a smart way to run some small “corporate ETL” in sync with the data refresh on Power BI, without relying on synchronized scheduling dates (which is always one more thing to maintain). This further extends the range of providers you can use with scheduled data refresh.
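
Here is a minimal sketch of the linked-server view technique mentioned in the list above; the linked server, database, and table names are placeholders, not real objects.

-- Hypothetical sketch: a SQL Server view over a linked server exposes data from
-- a source whose provider is not supported by the Data Management Gateway.
CREATE VIEW dbo.ExternalCustomers
AS
SELECT CustomerId,
       CustomerName,
       Country
FROM [MYLINKEDSRV].[RemoteDb].[dbo].[Customers];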

I would like to get more detailed errors when something goes wrong and scheduled data refresh stops, but this is a good start.

ICYMI: Data platform momentum

The last couple months have seen the addition of several new products that extend Microsoft’s data platform offerings.  

At the end of January, Quentin Clark outlined his vision for the complete data platform, exploring the various inputs that are driving new application patterns, new considerations for handling data of all shapes and sizes, and ultimately changing the way we can reveal business insights from data.

In February, we announced the general availability of Power BI for Office 365, and you heard from Kamal Hathi about how this exciting release simplifies business intelligence and how, with features like Power BI sites and Power BI Q&A, Power BI helps anyone, not just experts, gain value from their data. You also heard from Quentin Clark about how Power BI helps make big data work for everyone by bringing together easy access to data, robust tools that everyone can use, and a complete data platform.

In March, we announced that SQL Server 2014 would be generally available beginning April 1, and shared how companies are already taking advantage of the in-memory capabilities and hybrid cloud scenarios that SQL Server enables. Shawn Bice explored the platform continuum, and how, with this latest release, developers can continue to use SQL Server on-premises while also dipping their toes into the possibilities of the cloud using Microsoft Azure. Additionally, Microsoft Azure HDInsight was made generally available with support for Hadoop 2.2, making it easy to deploy Hadoop in the cloud.

And earlier this month at the Accelerate your insights event in San Francisco, CEO Satya Nadella discussed Microsoft’s drive towards a data culture. In addition, we announced two other key capabilities to extend the robustness of our data platform: the Analytics Platform System, an evolution of the Parallel Data Warehouse with the addition of a Hadoop region for your unstructured data, and a preview of the Microsoft Azure Intelligent Systems Service to help tap into the Internet of Your Things. In case you missed it, watch the keynotes on-demand, and don’t miss out on experiencing the Infinity Room, designed to inspire you with the extraordinary things that can be found in your data.

On top of our own announcements, we’ve been recently honored to be recognized by Gartner as a Leader in the 2014 Magic Quadrants for Data Warehouse Database Management Systems and Business Intelligence and Analytics Platforms. And SQL Server 2014, in partnership with Hewlett Packard, set two world records for data warehousing performance and price/performance.

With these enhancements across the entire Microsoft data platform, there is no better time than now to dig in. Learn more about our data platform offerings. Brush up on your technical skills for free on the Microsoft Virtual Academy. Connect with other SQL Server experts through the PASS community. Hear from Microsoft’s engineering leaders about Microsoft’s approach to developing the latest offerings. Read about the architecture of data-intensive applications in the cloud computing world from Mark Souza, which one commenter noted was a “great example for the future of application design/architecture in the Cloud and proof that the toolbox of the future for Application and Database Developers/DBAs is going to be bigger than the On-Prem one of the past.” And finally, come chat in-person – we’ll be hanging out at the upcoming PASS Business Analytics and TechEd events and are eager to hear more about your data opportunities, challenges, and of course, successes.

What can your data do for you?