Generate and capture the sfs_dump data:
In this section you will need to know how to connect to your Data Domain via an ssh client, and how to configure it to log any data which appears on the screen. This log file is not written on the Data Domain; it will be written to the system from which you are running the ssh client (perhaps your laptop). You should ensure there is sufficient space for the log file (which could be around 200 MB for systems with 1 million files).
1. Log in to the DDR from an ssh client.
2. Type system show serialno
3. Type priv set se
4. Authenticate using the serial number from Step 2. If your system uses retention-lock compliance then you will first need to authenticate as a user with the "security" role.
5. Switch on logging within your ssh client. Ensure "all session output" is enabled, and that the output file is named logically and will be placed in a suitable directory where you can find it. For PuTTY, this is under the Session -> Logging tab of the configuration screen. You may need to check the manual if you are using a different ssh client.
6. Type se sfs_dump -c (you can specify a path if you want to limit the output to a particular directory or MTree, e.g. se sfs_dump -c /data/col1/mymtree).
7. When the command has completed, switch off logging. If Excel is not installed on the same machine as the log file just created then you will need to copy it across.
Import into Excel:
In this section the previously created log file will be opened inside Excel, and the data imported into a spreadsheet. Excel's "Text Import Wizard" will be configured to split the data into individual columns. Any redundant information above the actual data will be removed.
8. Start Excel.
9. Open the file created by your ssh client's logging, using File -> Open. You may need to set the dialog to show "All files", depending on the suffix of the log file you created.
10. Excel will start the "Text Import Wizard".
11. Choose "Delimited" to describe the data.
12. Scroll through the preview, looking for the line which starts with "name" followed by "mtime", "fileid", "size" and "seg_bytes", and note the row number. This line is the header for the sfs_dump data and is the first line we wish to import. Set the "Start import at row" value to this number.
13. Click "Next".
14. Ensure the "Tab" delimiter box is ticked and the others are unticked, then check the preview to see filenames in the first column and numbers in the remaining columns. Note: you may need to scroll down a little, as the preview starts at the top of the file rather than at the header line set in Step 12. If the preview does not show filenames in the first column and numbers in the remaining columns, click "Back" and follow Steps 11-14 again. If it still doesn't appear correct then you will need to restart from Step 1, paying particular attention to ensuring that "all session output" is enabled. This is essential to allow the tab characters used in the sfs_dump output to be saved in the log file.
15. Click "Next".
16. Click "Finish".
17. At this point, Excel should contain data in columns A to I. If data is only present in column A, but stretches across the screen, then close Excel and retry from Step 8. If you still see the same result, re-read Step 14 and then restart from Step 1.
Add extra calculated fields:
In this section, extra fields will be added to show various compression values, and to convert the timestamp data into a human readable value. We will also remove any redundant information below the data. When you first enter the formulae in the following steps, the cell will display "#VALUE!" - this is normal and expected. Firstly, the timestamp conversion.
18. Copy the following formula and paste it into cell J1: =(((B1/1000000000+(-6*3600))/86400)+25569)
a) Change the -6 in the -6*3600 section to the offset between your timezone and UTC, e.g. -6 for Central, -5 for Eastern, -8 for Pacific. For timezones east of UTC (e.g. mainland Europe, India) the offset will be positive (e.g. 1 for Central Europe). If your timezone has a half-hour difference, make sure you include .5 in the value you use.
Secondly, the compression information.
19. Copy the following formula and paste it into cell K1: =IF(H1=0,0,D1/H1)
20. Copy the following formula and paste it into cell L1: =IF(I1=0,0,H1/I1)
21. Copy the following formula and paste it into cell M1: =K1*L1
22. Select cell J1, right-click, then "Format cells", and choose your preferred format from the Date selection.
23. Select cells K1, L1 and M1, right-click, then "Format cells", and set to Number with one decimal place.
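The three formulae in Steps 19-21 can be sketched in Python as follows. The zero checks mirror the IF(...=0,0,...) guards in the formulae; the sample sizes are made-up values.

```python
def compression(original, post_global, post_local):
    """Return (global, local, overall) compression ratios from the
    original size, the size after global compression (dedupe), and
    the size after global plus local compression."""
    g = original / post_global if post_global else 0.0   # column K
    l = post_global / post_local if post_local else 0.0  # column L
    return g, l, g * l                                   # column M

# A 10000-byte file stored as 2000 bytes after dedupe, 1000 after
# local compression: 5x global, 2x local, 10x overall.
print(compression(10000, 2000, 1000))  # (5.0, 2.0, 10.0)
```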
Next, we are going to copy the formulae we have just added to every row of data, using some useful shortcuts.
24. Select cells J1, K1, L1 and M1, right-click, then select Copy.
25. Scroll down to the bottom of the data using the vertical scrollbar on the right (which is quicker than using the arrow keys). Look for the last row which contains data in columns A through to I, and take a note of the row number. There should be a blank row below this one, followed by some summary information.
26. Select the cells in columns J, K, L and M in the row whose number you noted, then scroll back to the top of the sheet (drag the vertical slider to save time), hold down Shift and select cells J2, K2, L2 and M2. Now press Enter so that the values are pasted all the way down. Next we will remove any redundant rows below the data.
27. Return to the bottom of the data again, to the row number recorded in Step 25.
28. Select any entire rows underneath this row which still contain data (the summary information), and delete them.
Add remaining headers and filtering functionality:
In this section, we will name the calculated columns we added, and enable Excel's powerful filtering and sorting ability.
29. Go to cell J1 and change it to read "Last written date".
30. Change cell K1 to read "Global compression".
31. Change cell L1 to read "Local compression".
32. Change cell M1 to read "Overall compression".
33. Double-click on the divider between the headers for columns A and B to resize column A to the widest filename.
34. Select cell A1, then select "Format as Table" from the "Home" ribbon menu. Click on your preferred style - any from the second line under the "Medium" heading have been found to be easy to read. Ensure "My table has headers" is ticked, then click "OK" in the "Format as Table" dialog.
35. "Save as", ensuring the type is set to "Excel Workbook (*.xlsx)".
Information on the columns:
· A contains the filename
· D is the original size of the file
· H is the size after global compression
· I is the size after global and local compression
· J contains the datestamp for when the file was last written
· K contains the global compression or dedupe
· L contains the local compression
· M contains the overall compression for the file
All filesizes are in bytes.
You can now use Excel to filter/sort as required to report on your data.
For example, to show only files within the MTree "/data/col1/backup" which are over 30 days old:
· Click on the down arrow in the "Name" header, select "Text Filters", then "Begins With", and type "/data/col1/backup/" into the box. The trailing slash is important. Click "OK".
· Click on the down arrow in the "Last written date" header, select "Date Filters", then "Before". Use the date picker on the right of the dialog to select a date 30 days ago. Click "OK".
· The status bar at the bottom will show how many rows match this selection.
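The same filter can be sketched in plain Python, assuming the rows have already been parsed into (name, last written) pairs; the filenames and dates below are invented examples.

```python
from datetime import datetime, timedelta

# Invented sample rows: (filename, last written datetime).
rows = [
    ("/data/col1/backup/old.vbk", datetime(2020, 1, 1)),
    ("/data/col1/backup/new.vbk", datetime.now()),
    ("/data/col1/other/old.vbk", datetime(2020, 1, 1)),
]

# Files inside /data/col1/backup/ (trailing slash, as in the text)
# that were last written more than 30 days ago.
cutoff = datetime.now() - timedelta(days=30)
matches = [name for name, written in rows
           if name.startswith("/data/col1/backup/") and written < cutoff]
print(matches)  # ['/data/col1/backup/old.vbk']
```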
Notes:
This procedure relies on the Data Domain's sfs_dump command.
The information returned from this command includes a timestamp of the file (in
a non-human readable format) and internal information related to deduplication.
This information is only stored when the file is first written to the Data
Domain, hence it tends to become less accurate as other data is written
to/deleted from the Data Domain.
Please note this particular procedure uses the "-c" option of sfs_dump, which outputs in columns which are separated by tabs. It is essential that the tab characters present in the output are saved in the logging file, otherwise Excel will not be able to separate the fields correctly. There are other output formats for sfs_dump, and EMC Support may have requested these in other situations.
The instructions given assume a reasonable knowledge of Excel, but may be translated into other spreadsheet software.
Modern versions of Excel allow a large number of rows (bear in mind that each file on your system will require a row in the spreadsheet.)
If your system has too many files for your version of Excel you can try running the procedure once for each Mtree, so you will have multiple spreadsheets for the system.