Mondrian

1. What is a schema?

  • schema 个人理解就是一种对关系模型的描述。

  • A schema defines a multi-dimensional database. It contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model. 模型形态定义了一种多维数据库,包含逻辑模型,立方体(多维数据集),层次结构,成员组成,以及这个模型到物理模型的映射。

  • The logical model consists of the constructs used to write queries in MDX language: cubes, dimensions, hierarchies, levels, and members.

  • The physical model is the source of the data which is presented through the logical model. It is typically a star schema, which is a set of tables in a relational database; later, we shall see examples of other kinds of mappings. 物理模型是通过逻辑模型呈现的数据源,它通常是星型模型,在关系型数据库中就是一系列的表。

2. Schema files

  • Mondrian schemas are represented in an XML file. An example schema, containing almost all of the constructs we discuss here, is supplied as demo/FoodMart.xml in the Mondrian distribution. The dataset to populate this schema is also in the distribution. 通过XML文件表示Mondrian模型。

  • Currently, the only way to create a schema is to edit a schema XML file in a text editor. The XML syntax is not too complicated, so this is not as difficult as it sounds, particularly if you use the FoodMart schema as a guiding example.

3. Logical model

The most important components of a schema are cubes, measures, and dimensions: 模型中最重要的组件包含:多维数据集,度量,维度。

  • A cube is a collection of dimensions and measures in a particular subject area. 一个立方体是在某个子领域维度和度量的集合

  • A measure is a quantity that you are interested in measuring, for example, unit sales of a product, or cost price of inventory items. 一个度量是一个你感兴趣需要测量的指标。

  • A dimension is an attribute, or set of attributes, by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer, and the store in which the product was sold; color, gender, and store are all dimensions. 维度是属性或者属性集。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
<Schema>
<!-- 立方体名称 -->
<Cube name="Sales">
<!-- 事实表名称 -->
<Table name="sales_fact_1997" />
<!-- 性别维度,在事实表中的外键是customer_id,在customer表的主键是customer_id -->
<Dimension name="Gender" foreignKey="customer_id">
<Hierarchy hasAll="true" allMemberName="All Genders" primaryKey="customer_id">
<Table name="customer" />
<Level name="Gender" column="gender" uniqueMembers="true" />
</Hierarchy>
</Dimension>
<!-- 时间维度 -->
<Dimension name="Time" foreignKey="time_id">
<Hierarchy hasAll="false" primaryKey="time_id">
<Table name="time_by_day" />
<Level name="Year" column="the_year" type="Numeric" uniqueMembers="true" />
<Level name="Quarter" column="quarter" uniqueMembers="false" />
<Level name="Month" column="month_of_year" type="Numeric" uniqueMembers="false" />
</Hierarchy>
</Dimension>
<!-- 单位产品销售额度量,这个值在销售事实表中,因此决定是度量还是维度是人为定义的。-->
<Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="#,###" />
<Measure name="Store Sales" column="store_sales" aggregator="sum" formatString="#,###.##" />
<Measure name="Store Cost" column="store_cost" aggregator="sum" formatString="#,###.00" />
<!-- 计算成员,dimension="Measures"直接说明此是度量 -->
<CalculatedMember name="Profit" dimension="Measures" formula="[Measures].[Store Sales] - [Measures].[Store Cost]">
<CalculatedMemberProperty name="FORMAT_STRING" value="$#,##0.00" />
</CalculatedMember>
</Cube>
</Schema>
1
2
3
4
SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS, 查询指标度量在列
{descendants([Time].[1997].[Q1])} ON ROWS 时间维度在行
FROM [Sales] 从多维数据集Sales
WHERE [Gender].[F] 根据性别