Subscribe
Accepted Solution

Cinder upload-to-image sometimes fails when NFS File Handle Cache is enabled

[ Edited ]

Hello. We are using cDOT8.3P2 and OpenStack Mitaka on RHEL7.2 (Using RHOSP9 to build it).

 

Cinder upload-to-image sometimes fails if we create the instance from boot volume and trying to upload it. When it fails, image uploading will be canceled silently as below.

Spoiler
$ glance image-list
+--------------------------------------+------------------------+
| ID                                   | Name                   |
+--------------------------------------+------------------------+
| 3813c31c-af2d-4d99-96f5-f9d4a008f48b | centos6.7_20161011     |
| 70166463-1c1d-43d5-962e-c5d6b3ec5f6f | centos7.1_20161011     |
| ff1ffbd9-433d-4d41-9eff-16fe1db76a60 | rhel6.7_20161011       |
| 7d4d369d-f20d-46a7-85d7-6baf45391273 | rhel7.1_20161011       |
| a14fa9e0-49e2-4f03-aa98-5dd7e2eb6bf7 | windows2012R2_20161011 |
+--------------------------------------+------------------------+
$
$ cinder create --image-id 7d4d369d-f20d-46a7-85d7-6baf45391273 --display-name test-sysdisk01 --volume-type nfs_sys1 40
+------------------------------+--------------------------------------+
|           Property           |                Value                 |
+------------------------------+--------------------------------------+
|         attachments          |                  []                  |
|      availability_zone       |                 nova                 |
|           bootable           |                false                 |
|     consistencygroup_id      |                 None                 |
|          created_at          |      2016-10-24T07:42:53.000000      |
|         description          |                 None                 |
|          encrypted           |                False                 |
|              id              | c3291fd4-6880-492e-be08-2dd5bdf3753e |
|           metadata           |                  {}                  |
|         multiattach          |                False                 |
|             name             |            test-sysdisk01            |
| os-vol-tenant-attr:tenant_id |   b80a1a7a8b9d4a70b95ec205255fb1e6   |
|      replication_status      |               disabled               |
|             size             |                  40                  |
|         snapshot_id          |                 None                 |
|         source_volid         |                 None                 |
|            status            |               creating               |
|          updated_at          |      2016-10-24T07:42:54.000000      |
|           user_id            |   f5b7c187fc4648768a9c8d38c3bae2fb   |
|         volume_type          |               nfs_sys1               |
+------------------------------+--------------------------------------+
$
$ cinder list
+--------------------------------------+-----------+----------------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  |         Name         | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+----------------------+------+-------------+----------+--------------------------------------+
| c3291fd4-6880-492e-be08-2dd5bdf3753e | available |    test-sysdisk01    |  40  |   nfs_sys1  |   true   |                                      |
+--------------------------------------+-----------+----------------------+------+-------------+----------+--------------------------------------+
$
$ nova boot --flavor small.linux --boot-volume c3291fd4-6880-492e-be08-2dd5bdf3753e --availability-zone az-1 test-instance-01
+--------------------------------------+--------------------------------------------------+
| Property                             | Value                                            |
+--------------------------------------+--------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                           |
| OS-EXT-AZ:availability_zone          | az-1                                           |
| OS-EXT-STS:power_state               | 0                                                |
| OS-EXT-STS:task_state                | scheduling                                       |
| OS-EXT-STS:vm_state                  | building                                         |
| OS-SRV-USG:launched_at               | -                                                |
| OS-SRV-USG:terminated_at             | -                                                |
| accessIPv4                           |                                                  |
| accessIPv6                           |                                                  |
| adminPass                            | TrQme29ZrBy2                                     |
| config_drive                         |                                                  |
| created                              | 2016-10-24T07:43:13Z                             |
| flavor                               | small.linux (11)                                 |
| hostId                               |                                                  |
| id                                   | f9d3ef82-73ad-4db6-8858-27af6761df95             |
| image                                | Attempt to boot from volume - no image supplied  |
| key_name                             | -                                                |
| metadata                             | {}                                               |
| name                                 | test-instance-01                                 |
| os-extended-volumes:volumes_attached | [{"id": "c3291fd4-6880-492e-be08-2dd5bdf3753e"}] |
| progress                             | 0                                                |
| security_groups                      | default                                          |
| status                               | BUILD                                            |
| tenant_id                            | b80a1a7a8b9d4a70b95ec205255fb1e6                 |
| updated                              | 2016-10-24T07:43:13Z                             |
| user_id                              | f5b7c187fc4648768a9c8d38c3bae2fb                 |
+--------------------------------------+--------------------------------------------------+
$
$ nova list
+--------------------------------------+--------------------+--------+------------+-------------+--------------------------------+
| ID                                   | Name               | Status | Task State | Power State | Networks                       |
+--------------------------------------+--------------------+--------+------------+-------------+--------------------------------+
| f9d3ef82-73ad-4db6-8858-27af6761df95 | test-instance-01   | ACTIVE | -          | Running     | iaas-ut-network-1=192.168.10.6 |
+--------------------------------------+--------------------+--------+------------+-------------+--------------------------------+
$
$ nova stop test-instance-01
Request to stop server test-instance-01 has been accepted.
$
$ nova list
+--------------------------------------+--------------------+---------+------------+-------------+--------------------------------+
| ID                                   | Name               | Status  | Task State | Power State | Networks                       |
+--------------------------------------+--------------------+---------+------------+-------------+--------------------------------+
| f9d3ef82-73ad-4db6-8858-27af6761df95 | test-instance-01   | SHUTOFF | -          | Shutdown    | iaas-ut-network-1=192.168.10.6 |
+--------------------------------------+--------------------+---------+------------+-------------+--------------------------------+
$
$ cinder upload-to-image --force True c3291fd4-6880-492e-be08-2dd5bdf3753e test-image
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|       Property      |                                                                                                                                       Value                                                                                                                                        |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|   container_format  |                                                                                                                                        bare                                                                                                                                        |
|     disk_format     |                                                                                                                                        raw                                                                                                                                         |
| display_description |                                                                                                                                        None                                                                                                                                        |
|          id         |                                                                                                                        c3291fd4-6880-492e-be08-2dd5bdf3753e                                                                                                                        |
|       image_id      |                                                                                                                        9f108b49-1121-4912-bec5-e843b3dd5e66                                                                                                                        |
|      image_name     |                                                                                                                                     test-image                                                                                                                                     |
|         size        |                                                                                                                                         40                                                                                                                                         |
|        status       |                                                                                                                                     uploading                                                                                                                                      |
|      updated_at     |                                                                                                                             2016-10-24T07:43:15.000000                                                                                                                             |
|     volume_type     | {u'description': None, u'deleted': False, u'created_at': u'2016-10-11T04:57:04.000000', u'updated_at': None, u'extra_specs': {u'volume_backend_name': u'nfs_sys1'}, u'is_public': True, u'deleted_at': None, u'id': u'8625279f-1cdb-4b4c-8638-1e238974f408', u'name': u'nfs_sys1'} |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
$
$ glance image-list
+--------------------------------------+------------------------+
| ID                                   | Name                   |
+--------------------------------------+------------------------+
| 3813c31c-af2d-4d99-96f5-f9d4a008f48b | centos6.7_20161011     |
| 70166463-1c1d-43d5-962e-c5d6b3ec5f6f | centos7.1_20161011     |
| ff1ffbd9-433d-4d41-9eff-16fe1db76a60 | rhel6.7_20161011       |
| 7d4d369d-f20d-46a7-85d7-6baf45391273 | rhel7.1_20161011       |
| a14fa9e0-49e2-4f03-aa98-5dd7e2eb6bf7 | windows2012R2_20161011 |
+--------------------------------------+------------------------+
$
$

Error message is shown in the volume.log as below.

2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher [req-644c0aae-60d0-455e-bde3-905bf0bd8e95 f5b7c187fc4648768a9c8d38c3bae2fb b80a1a7a8b9d4a70b95ec205255fb1e6 - - -] Exception during message handling: [Errno 13] Permission denied: u'/var/lib/cinder/mnt/e9e73a2197c2a9769d72983242b91dd1/volume-c3291fd4-6880-492e-be08-2dd5bdf3753e'
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     incoming.message))
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 1320, in copy_volume_to_image
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     payload['message'] = six.text_type(error)
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     self.force_reraise()
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 1306, in copy_volume_to_image
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     image_service, image_meta)
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/remotefs.py", line 428, in copy_volume_to_image
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     self.local_path(volume))
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/cinder/image/image_utils.py", line 377, in upload_volume
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher     with open(volume_path) as image_file:
2016-10-24 16:44:43.581 4588 ERROR oslo_messaging.rpc.dispatcher IOError: [Errno 13] Permission denied: u'/var/lib/cinder/mnt/e9e73a2197c2a9769d72983242b91dd1/volume-c3291fd4-6880-492e-be08-2dd5bdf3753e'

However, when we disabled NFS File Handle Cache by editing cinder.conf, the issue was solved.

 

-nfs_mount_options=lookupcache=pos
+nfs_mount_options=lookupcache=none

We got the idea for this solution from these articles.
https://bugs.launchpad.net/cinder/+bug/1608596
https://community.netapp.com/t5/OpenStack-Discussions/Create-volume-from-image-cache-became-slow-in-cinder-2015-1-3/m-p/121082

 

I think that cinder dataontap driver (or nova?) can't update NFS cache correctly. So, cinder driver gets the old information for permissions of the volume before uploading and it fails to access the volume.

 

Is this a bug? or shoud we disable NFS File Handle Cache when we use NetApp?

 

 

Thanks,
Iwao Endo.

Re: Cinder upload-to-image sometimes fails when NFS File Handle Cache is enabled

Hello Iwao,

 

Are you using NFS v4 ?

 

Best Regards,

Bishoy

Re: Cinder upload-to-image sometimes fails when NFS File Handle Cache is enabled

Thanks Bishoy,

 

> Are you using NFS v4 ?

Yes, we are using NFS v4.

 

We mount a FlexVolume as below.

*.*.*.*:/cinder on /var/lib/cinder/mnt/e9e73a2197c2a9769d72983242b91dd1 type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=*.*.*.*,lookupcache=none,local_lock=none,addr=*.*.*.*)

We enable NFS options as below.

cluster::> vserver nfs show -instance

Vserver: svm01
General NFS Access: true
NFS v3: enabled
NFS v4.0: enabled
UDP Protocol: enabled
TCP Protocol: enabled
Default Windows User: -
NFSv4.0 ACL Support: disabled
NFSv4.0 Read Delegation Support: disabled
NFSv4.0 Write Delegation Support: disabled
NFSv4 ID Mapping Domain: defaultv4iddomain.com
NFSv4 Grace Timeout Value (in secs): 45
Preserves and Modifies NFSv4 ACL (and NTFS File Permissions in Unified Security Style): enabled
NFSv4.1 Minor Version Support: enabled
Rquota Enable: disabled
NFSv4.1 Parallel NFS Support: disabled
NFSv4.1 ACL Support: disabled
NFS vStorage Support: enabled
NFSv4 Support for Numeric Owner IDs: enabled
Default Windows Group: -
NFSv4.1 Read Delegation Support: disabled
NFSv4.1 Write Delegation Support: disabled
NFS Mount Root Only: enabled
NFS Root Only: disabled
Permitted Kerberos Encryption Types: des, des3, aes-128, aes-256
Showmount Enabled: disabled
Set the Protocol Used for Name Services Lookups for Exports: udp

  

Best Regards,

Iwao Endo

Re: Cinder upload-to-image sometimes fails when NFS File Handle Cache is enabled

Hi Iwao,

 

I have seen this before and it looks to me like NFSv4 configuration.

 

If this is not production, will you be able to use NFS3 with the positive cache enabled then re-test? it's better than looking through the loads of nfs4 config.

 

If it works then we can look into nfs4 config

 

 

Best regards,

Bishoy 

Highlighted

Re: Cinder upload-to-image sometimes fails when NFS File Handle Cache is enabled

seems there is more than cache issue here. remotefs and image_utils shouldn't be invoked unless you are using qcow images, are you?

We support RAW ones on the NetApp driver(covered in other dicussions)