In latest release (SAP HANA SP09) SAP released new SAP HANA feature – Multitenant Database Containers (MDC). I am personally excited by this news and I am seeing there quite a huge potential. But what are the real use cases for this feature? Is it making SAP HANA really cloud product which can be used to serve different customers? Or it is feature for one customer which can help him consolidate his workloads...
I am getting these questions quite often and honestly I did not yet formed my own opinion myself. Although this might be little illogical I decided to write a blog on this topic because it might be good way to draw some attention to this point and maybe (hopefully) trigger some discussion on this topic.
This blog is containing my own ideas and opinions – it is NOT reflecting opinion of my employer in any way.
Concept of Multitenant Database Containers (MDC)
MDC containers are nicely explained in many other materials, blogs and articles – so I will not attempt to document something what is already covered. For explanation I can point for example to SAP HANA Master Guide (page 25) available here: http://help.sap.com/hana_platform
In short – with latest SAP HANA release you are able to create multiple separate database containers which will be listening on specified ports and which will be independent on each other. This brings many benefits against traditional MCOD and MCOS deployment models (see SAP HANA Master Guide mentioned above for definition and explanation).
I would not hesitate to say that this new option might be seen as replacement for MCOD and MCOS making them obsolete and I would not expect big disagreement from the community.
Can SAP HANA be used for multiple customers?
But does this feature really replace virtualization? Can one SAP HANA installation be used by different customers? Is this concept safe enough?
Currently I would be very careful in deploying SAP HANA in this way. By saying this I do not mean it is not possible – all I am trying to say is that extra effort is required before such deployment can be used.
What is my main concern? Typically shared environments are offering very strong separation which is achieved on network level. Customers are really using same infrastructure however this infrastructure is configured in a way that network packet cannot leave from one tenant into another tenant - unless of course this is desired and it is intentionally configured – and even in such case these packets are traveling across one or more firewalls controlling that this traffic is really something expected.
This is very important for security because humans are very creative and they tend to find most unbelievable ways how to break into places which were (until that moment) seen as impenetrable.
Probably all hypervisors (including VMware) are offering this strong separation. Individual Virtual Machines (VMs) are having own IP address and hypervisor is ensuring that packets are delivered only to the particular VM which is expected to receive them.
Issue with SAP HANA being used by multiple customers is that such strong separation on network level is not possible. I have no reason to not trust SAP when they say that SAP HANA is internally multitenant. But I know for sure it is not externally multitenant on network level – it simply cannot be on its own at this phase. It is still one environment accessible by many customers. If customer can connect to one port (associated with their database container) then there is chance he might be able to connect to the other port which is associated with database container of different customer. At least this will happen if no additional actions are taken to secure such setup.
What could be done to improve such setup? Well after talking to different people I found that there are number of ways how you might increase the security of such setup.
For example you might encrypt communication channels and encrypt the storage to make it harder to access the data from other tenants. However this is not blocking the access to the other tenants – it is only making it more difficult.
Another alternative might be to put firewalls around SAP HANA to filter the traffic and to ensure that each tenant is blocked from connecting to ports (representing database containers) that do not belong to given tenant. This might be working solution however it is increasing the costs and overall complexity of such solution. Also might impact bandwidth and latency of particular flows spoiling performance. And last but not least – effort to automate such setup is increasing considerably.
Last area worth mentioning is openness of SAP HANA itself. SAP HANA is not “only” database – it is more than this – it is platform in which we can develop applications. However from security perspective this brings a lot of risks. I am not SAP HANA developer so I might be wrong here (and feel free to correct me here if you think so) but I can imagine smart developer coding application which will allow him to connect to port belonging to another tenant's database container which might be network flow not controlled by firewall because it is on same server.
Bottom line – I am seeing all options above only as obstacles which are making it more difficult for attacker to breach into database containers belonging to other tenants. And honestly at this point I do not know what is the best infrastructure architecture to securely deploy SAP HANA used by different customers.
And this is where I am seeing additional effort required. This is either something individual providers will have to figure out themselves or something where SAP can create reference architecture.
Of course it is obvious that such reference architecture would boost MDC adoption among various providers offering SAP HANA to their customers while absence of it will be strong inhibitor. And since objective of sharing workloads is motivated by intention to decrease the costs this will in turn impact adoption of SAP HANA itself.
Is there any smarter solution?
I am seeing approach I described above as very complex and not very transparent and I believe there is better option however SAP would have to step into the new area where they are not yet operating. This act might also have some drawbacks which were described in following blog: http://diginomica.com/2013/12/20/multi-tenant-multi-instance-saas-spectrum
Here I would like to outline that all descriptions below are my own speculations of what could be done to make SAP HANA truly multitenant. This is NOT what SAP suggested that they plan to do or what is on their roadmap.
In my opinion major improvement would be if SAP HANA would adopt some principles from virtualization – in particular Software Defined Networking (SDN) approach. There is no need to virtualize complete VMs – it would be sufficient to allow SAP HANA to associate each database container with its own IP address and then ensure routing of particular network packets to the right destination. In short it would be doing similar network service VMware hypervisor is doing to individual VMs.
On top of this SAP HANA would need similar options like VMware to define internal routing so that it is clear which database containers inside one SAP HANA installation belong to the same customer and are allowed to see each other and which are blocked on internal network level (inside SAP HANA instance).
Why is this critical? Because if done properly it will push down the separation between tenants to the network level (although virtualized) ensuring that no breach could happen on application level – and all this without the need to build overcomplicated network architectures.
It would also enable additional features which are seen in virtualized world (here I will use VMware as reference). I can imagine having similar features like vMotion – where database container could be moved to another node without any disruption – as IP address would remain same it could be stateful move completely transparent to external applications. Feature like VMware Distributed Resource Scheduler (DRS) where SAP HANA itself could relocate containers based on actual utilization and respecting preconfigured rules. Or features like VMware Fault Tolerance where container would be internally redundant preventing any disruption in case that one node will fail.
All this could be complemented by allowing differences in revisions on individual nodes – which could help to ensure that any updates are completely without any downtime – where nodes will be upgraded one by one and containers would be relocated without any disruption to the node with latest revision.
In summary such step might open quite a lot of options for SAP HANA where SAP HANA would become virtualization layer on its own – completely separating application (database container) from underlying infrastructure (hardware, OS, instance processes).
Summary
To summarize – I believe that SAP HANA Multitenant Database Containers (MDC) are interesting option how to consolidate workloads for large customers having multiple SAP HANA landscapes or in other similar situations where strong separation is not that critical.
On the other hand I am not yet convinced that SAP HANA MDC containers can be used (at least not out-of-the box) as shared solution for different customers on same SAP HANA installation. It might be possible – but not without very careful assessment of risks and creation of infrastructure architecture which would ensure full separation of individual clients.
I do not know how much is SAP ambitious with SAP HANA and if they really intend to turn SAP HANA into fully multitenant software but I am curious to see how things will develop with next releases of SAP HANA.
No comments:
Post a Comment