[PATCH 00/32] xfs: current queue for 3.8

Dave Chinner
Hi folks,

This is my current patch queue for the 3.8 merge window. We are now
getting close to the window opening (at -rc5 now), so I'd really
like to see this stuff into the dev tree ASAP so that there is some
wider test coverage before the merge window comes along.

The bulk of this patch series has been reviewed and revised over the
past month. The only new patch in this is the additional attribute
trace points that I needed to track down the corruption problem I
recently fixed.

Other than that, I've reordered the patches to make growfs use
uncached buffers ahead of the verifier series and rebased the
verifier series on top of it. I also folded the fixes I had in
additional patches back into the base patches in the verifier
series.

I'm not sure whether I have captured all the Reviewed-by tags that
people have given - if necessary I can go back and search the lists
for them all and add the ones I've missed....

Diffstat for the series is:

$ git diff --stat --summary -C -M 074dad5..f02d23b
 fs/xfs/Kconfig            |    1 +
 fs/xfs/Makefile           |    1 -
 fs/xfs/uuid.h             |    6 +
 fs/xfs/xfs_ag.h           |    4 +
 fs/xfs/xfs_alloc.c        |  141 ++++++++++++---
 fs/xfs/xfs_alloc.h        |    3 +
 fs/xfs/xfs_alloc_btree.c  |   77 +++++++++
 fs/xfs/xfs_alloc_btree.h  |    2 +
 fs/xfs/xfs_aops.c         |    2 +-
 fs/xfs/xfs_attr.c         |  103 +++++------
 fs/xfs/xfs_attr_leaf.c    |  143 ++++++++++------
 fs/xfs/xfs_attr_leaf.h    |    6 +
 fs/xfs/xfs_bmap.c         |   64 ++++---
 fs/xfs/xfs_bmap_btree.c   |   63 +++++++
 fs/xfs/xfs_bmap_btree.h   |    1 +
 fs/xfs/xfs_btree.c        |  111 +++++++-----
 fs/xfs/xfs_btree.h        |   22 ++-
 fs/xfs/xfs_buf.c          |   59 +++++--
 fs/xfs/xfs_buf.h          |   27 ++-
 fs/xfs/xfs_cksum.h        |   63 +++++++
 fs/xfs/xfs_da_btree.c     |  141 ++++++++++++---
 fs/xfs/xfs_da_btree.h     |   10 +-
 fs/xfs/xfs_dfrag.c        |   13 +-
 fs/xfs/xfs_dir2_block.c   |  436 +++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_dir2_data.c    |  170 ++++++++++++++----
 fs/xfs/xfs_dir2_leaf.c    |  172 +++++++++++++------
 fs/xfs/xfs_dir2_node.c    |  288 ++++++++++++++++++++-----------
 fs/xfs/xfs_dir2_priv.h    |   19 ++-
 fs/xfs/xfs_dquot.c        |  135 ++++++++++++---
 fs/xfs/xfs_file.c         |   27 +--
 fs/xfs/xfs_fs_subr.c      |   96 -----------
 fs/xfs/xfs_fsops.c        |  137 ++++++++++-----
 fs/xfs/xfs_ialloc.c       |   74 +++++---
 fs/xfs/xfs_ialloc.h       |    4 +-
 fs/xfs/xfs_ialloc_btree.c |   55 ++++++
 fs/xfs/xfs_ialloc_btree.h |    2 +
 fs/xfs/xfs_inode.c        |  131 ++++++++------
 fs/xfs/xfs_inode.h        |    1 +
 fs/xfs/xfs_iops.c         |    4 +-
 fs/xfs/xfs_itable.c       |    3 +-
 fs/xfs/xfs_linux.h        |    1 +
 fs/xfs/xfs_log.c          |  135 ++++++++++++---
 fs/xfs/xfs_log_priv.h     |   11 +-
 fs/xfs/xfs_log_recover.c  |  145 ++++++++--------
 fs/xfs/xfs_mount.c        |  130 +++++++++-----
 fs/xfs/xfs_mount.h        |    4 +-
 fs/xfs/xfs_qm.c           |    5 +-
 fs/xfs/xfs_rtalloc.c      |   15 +-
 fs/xfs/xfs_sb.h           |   10 +-
 fs/xfs/xfs_trace.h        |   54 +++++-
 fs/xfs/xfs_trans.h        |   19 +--
 fs/xfs/xfs_trans_buf.c    |    9 +-
 fs/xfs/xfs_vnodeops.c     |   48 ++++--
 fs/xfs/xfs_vnodeops.h     |    7 -
 54 files changed, 2327 insertions(+), 1083 deletions(-)
 create mode 100644 fs/xfs/xfs_cksum.h
 delete mode 100644 fs/xfs/xfs_fs_subr.c

It seems pretty solid - all the bug fixes I've been pushing out
recently have been found as a result of testing this patch series.
They have started life at the end of the series, and once confirmed
to fix the problem have been re-ordered to the start. Hence the
series has been seeing all the testing I have been doing recently.

I really do not want this stuff to miss the 3.8 window due
to a repeat of the last cycle's misadventures. Given how quiet -rc5
was, we might only be 2 weeks away from the 3.8 merge window
opening. Which means that, realistically, this series needs to be
finalised by the end of the week so that it's got some soak time in
linux-next before it moves into Linus' tree.

The main reason I don't want this to miss 3.8 is that I'm planning
on 3.9 for all the CRC metadata format changes and supporting code
to be ready. There's a lot more code coming for 3.9 than there
is in this patch series (probably twice the size) and it's a lot
more complex, so the less that ends up in 3.9 from this series the
better...

Cheers,

Dave.

_______________________________________________
xfs mailing list
[hidden email]
http://oss.sgi.com/mailman/listinfo/xfs

[PATCH 01/32] xfs: add more attribute tree trace points.

Dave Chinner
From: Dave Chinner <[hidden email]>

Added when debugging recent attribute tree problems to more finely
trace code execution through the maze of twisty passages that makes
up the attr code.
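
For readers following the new event class in the diff below: it copies the attribute name by length and prints it with "%.*s", which suggests the name is not treated as a NUL-terminated string. The following is a minimal userspace sketch of that formatting idea only. format_attr_name() is a hypothetical helper, not a kernel function, and the sketch passes an empty string where the trace format passes NULL for a zero-length name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace sketch only: format a name that is NOT NUL-terminated by
 * passing its length explicitly, as "%.*s" does in the trace event's
 * print format. format_attr_name() is a hypothetical helper.
 */
static int
format_attr_name(char *out, size_t outlen, const char *name, int namelen)
{
	assert(namelen >= 0);

	/* "%.*s" reads at most namelen bytes, so no terminator is needed. */
	return snprintf(out, outlen, "name %.*s namelen %d",
			namelen, namelen ? name : "", namelen);
}
```

Calling it with an 8-byte name taken from a larger unterminated buffer prints only those 8 bytes, which is why the length has to be recorded in the trace entry alongside the name.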

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_attr.c      |   18 ++++++++++++++++
 fs/xfs/xfs_attr_leaf.c |   37 +++++++++++++++++++--------------
 fs/xfs/xfs_da_btree.c  |    6 ++++++
 fs/xfs/xfs_trace.h     |   54 +++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_attr.c b/fs/xfs/xfs_attr.c
index 0ca1f0b..55bbe98 100644
--- a/fs/xfs/xfs_attr.c
+++ b/fs/xfs/xfs_attr.c
@@ -1155,6 +1155,8 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
  struct xfs_buf *bp;
  int error;
 
+ trace_xfs_attr_leaf_get(args);
+
  args->blkno = 0;
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
      XFS_ATTR_FORK);
@@ -1185,6 +1187,8 @@ xfs_attr_leaf_list(xfs_attr_list_context_t *context)
  int error;
  struct xfs_buf *bp;
 
+ trace_xfs_attr_leaf_list(context);
+
  context->cursor->blkno = 0;
  error = xfs_da_read_buf(NULL, context->dp, 0, -1, &bp, XFS_ATTR_FORK);
  if (error)
@@ -1653,6 +1657,8 @@ xfs_attr_fillstate(xfs_da_state_t *state)
  xfs_da_state_blk_t *blk;
  int level;
 
+ trace_xfs_attr_fillstate(state->args);
+
  /*
  * Roll down the "path" in the state structure, storing the on-disk
  * block number for those buffers in the "path".
@@ -1699,6 +1705,8 @@ xfs_attr_refillstate(xfs_da_state_t *state)
  xfs_da_state_blk_t *blk;
  int level, error;
 
+ trace_xfs_attr_refillstate(state->args);
+
  /*
  * Roll down the "path" in the state structure, storing the on-disk
  * block number for those buffers in the "path".
@@ -1755,6 +1763,8 @@ xfs_attr_node_get(xfs_da_args_t *args)
  int error, retval;
  int i;
 
+ trace_xfs_attr_node_get(args);
+
  state = xfs_da_state_alloc();
  state->args = args;
  state->mp = args->dp->i_mount;
@@ -1804,6 +1814,8 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
  int error, i;
  struct xfs_buf *bp;
 
+ trace_xfs_attr_node_list(context);
+
  cursor = context->cursor;
  cursor->initted = 1;
 
@@ -1959,6 +1971,8 @@ xfs_attr_rmtval_get(xfs_da_args_t *args)
  int nmap, error, tmp, valuelen, blkcnt, i;
  xfs_dablk_t lblkno;
 
+ trace_xfs_attr_rmtval_get(args);
+
  ASSERT(!(args->flags & ATTR_KERNOVAL));
 
  mp = args->dp->i_mount;
@@ -2014,6 +2028,8 @@ xfs_attr_rmtval_set(xfs_da_args_t *args)
  xfs_dablk_t lblkno;
  int blkcnt, valuelen, nmap, error, tmp, committed;
 
+ trace_xfs_attr_rmtval_set(args);
+
  dp = args->dp;
  mp = dp->i_mount;
  src = args->value;
@@ -2143,6 +2159,8 @@ xfs_attr_rmtval_remove(xfs_da_args_t *args)
  xfs_dablk_t lblkno;
  int valuelen, blkcnt, nmap, error, done, committed;
 
+ trace_xfs_attr_rmtval_remove(args);
+
  mp = args->dp->i_mount;
 
  /*
diff --git a/fs/xfs/xfs_attr_leaf.c b/fs/xfs/xfs_attr_leaf.c
index 70eec18..4bfc732 100644
--- a/fs/xfs/xfs_attr_leaf.c
+++ b/fs/xfs/xfs_attr_leaf.c
@@ -57,7 +57,8 @@ STATIC int xfs_attr_leaf_create(xfs_da_args_t *args, xfs_dablk_t which_block,
  struct xfs_buf **bpp);
 STATIC int xfs_attr_leaf_add_work(struct xfs_buf *leaf_buffer,
   xfs_da_args_t *args, int freemap_index);
-STATIC void xfs_attr_leaf_compact(xfs_trans_t *tp, struct xfs_buf *leaf_buffer);
+STATIC void xfs_attr_leaf_compact(struct xfs_da_args *args,
+  struct xfs_buf *leaf_buffer);
 STATIC void xfs_attr_leaf_rebalance(xfs_da_state_t *state,
    xfs_da_state_blk_t *blk1,
    xfs_da_state_blk_t *blk2);
@@ -1071,7 +1072,7 @@ xfs_attr_leaf_add(
  * Compact the entries to coalesce free space.
  * This may change the hdr->count via dropping INCOMPLETE entries.
  */
- xfs_attr_leaf_compact(args->trans, bp);
+ xfs_attr_leaf_compact(args, bp);
 
  /*
  * After compaction, the block is guaranteed to have only one
@@ -1102,6 +1103,8 @@ xfs_attr_leaf_add_work(
  xfs_mount_t *mp;
  int tmp, i;
 
+ trace_xfs_attr_leaf_add_work(args);
+
  leaf = bp->b_addr;
  ASSERT(leaf->hdr.info.magic == cpu_to_be16(XFS_ATTR_LEAF_MAGIC));
  hdr = &leaf->hdr;
@@ -1214,15 +1217,17 @@ xfs_attr_leaf_add_work(
  */
 STATIC void
 xfs_attr_leaf_compact(
- struct xfs_trans *trans,
- struct xfs_buf *bp)
+ struct xfs_da_args *args,
+ struct xfs_buf *bp)
 {
- xfs_attr_leafblock_t *leaf_s, *leaf_d;
- xfs_attr_leaf_hdr_t *hdr_s, *hdr_d;
- xfs_mount_t *mp;
- char *tmpbuffer;
+ xfs_attr_leafblock_t *leaf_s, *leaf_d;
+ xfs_attr_leaf_hdr_t *hdr_s, *hdr_d;
+ struct xfs_trans *trans = args->trans;
+ struct xfs_mount *mp = trans->t_mountp;
+ char *tmpbuffer;
+
+ trace_xfs_attr_leaf_compact(args);
 
- mp = trans->t_mountp;
  tmpbuffer = kmem_alloc(XFS_LBSIZE(mp), KM_SLEEP);
  ASSERT(tmpbuffer != NULL);
  memcpy(tmpbuffer, bp->b_addr, XFS_LBSIZE(mp));
@@ -1345,9 +1350,8 @@ xfs_attr_leaf_rebalance(xfs_da_state_t *state, xfs_da_state_blk_t *blk1,
  max  = be16_to_cpu(hdr2->firstused)
  - sizeof(xfs_attr_leaf_hdr_t);
  max -= be16_to_cpu(hdr2->count) * sizeof(xfs_attr_leaf_entry_t);
- if (space > max) {
- xfs_attr_leaf_compact(args->trans, blk2->bp);
- }
+ if (space > max)
+ xfs_attr_leaf_compact(args, blk2->bp);
 
  /*
  * Move high entries from leaf1 to low end of leaf2.
@@ -1378,9 +1382,8 @@ xfs_attr_leaf_rebalance(xfs_da_state_t *state, xfs_da_state_blk_t *blk1,
  max  = be16_to_cpu(hdr1->firstused)
  - sizeof(xfs_attr_leaf_hdr_t);
  max -= be16_to_cpu(hdr1->count) * sizeof(xfs_attr_leaf_entry_t);
- if (space > max) {
- xfs_attr_leaf_compact(args->trans, blk1->bp);
- }
+ if (space > max)
+ xfs_attr_leaf_compact(args, blk1->bp);
 
  /*
  * Move low entries from leaf2 to high end of leaf1.
@@ -1577,6 +1580,8 @@ xfs_attr_leaf_toosmall(xfs_da_state_t *state, int *action)
  xfs_dablk_t blkno;
  struct xfs_buf *bp;
 
+ trace_xfs_attr_leaf_toosmall(state->args);
+
  /*
  * Check for the degenerate case of the block being over 50% full.
  * If so, it's not worth even looking to see if we might be able
@@ -1702,6 +1707,8 @@ xfs_attr_leaf_remove(
  int tablesize, tmp, i;
  xfs_mount_t *mp;
 
+ trace_xfs_attr_leaf_remove(args);
+
  leaf = bp->b_addr;
  ASSERT(leaf->hdr.info.magic == cpu_to_be16(XFS_ATTR_LEAF_MAGIC));
  hdr = &leaf->hdr;
diff --git a/fs/xfs/xfs_da_btree.c b/fs/xfs/xfs_da_btree.c
index 7bfb7dd..c62e7e6 100644
--- a/fs/xfs/xfs_da_btree.c
+++ b/fs/xfs/xfs_da_btree.c
@@ -779,6 +779,8 @@ xfs_da_node_toosmall(xfs_da_state_t *state, int *action)
  xfs_dablk_t blkno;
  struct xfs_buf *bp;
 
+ trace_xfs_da_node_toosmall(state->args);
+
  /*
  * Check for the degenerate case of the block being over 50% full.
  * If so, it's not worth even looking to see if we might be able
@@ -900,6 +902,8 @@ xfs_da_fixhashpath(xfs_da_state_t *state, xfs_da_state_path_t *path)
  xfs_dahash_t lasthash=0;
  int level, count;
 
+ trace_xfs_da_fixhashpath(state->args);
+
  level = path->active-1;
  blk = &path->blk[ level ];
  switch (blk->magic) {
@@ -1417,6 +1421,8 @@ xfs_da_path_shift(xfs_da_state_t *state, xfs_da_state_path_t *path,
  xfs_dablk_t blkno=0;
  int level, error;
 
+ trace_xfs_da_path_shift(state->args);
+
  /*
  * Roll up the Btree looking for the first block where our
  * current index is not at the edge of the block.  Note that
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index cb52346..2e137d4 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -96,6 +96,8 @@ DEFINE_ATTR_LIST_EVENT(xfs_attr_list_full);
 DEFINE_ATTR_LIST_EVENT(xfs_attr_list_add);
 DEFINE_ATTR_LIST_EVENT(xfs_attr_list_wrong_blk);
 DEFINE_ATTR_LIST_EVENT(xfs_attr_list_notfound);
+DEFINE_ATTR_LIST_EVENT(xfs_attr_leaf_list);
+DEFINE_ATTR_LIST_EVENT(xfs_attr_node_list);
 
 DECLARE_EVENT_CLASS(xfs_perag_class,
  TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, int refcount,
@@ -1502,8 +1504,42 @@ DEFINE_DIR2_EVENT(xfs_dir2_node_replace);
 DEFINE_DIR2_EVENT(xfs_dir2_node_removename);
 DEFINE_DIR2_EVENT(xfs_dir2_node_to_leaf);
 
+DECLARE_EVENT_CLASS(xfs_attr_class,
+ TP_PROTO(struct xfs_da_args *args),
+ TP_ARGS(args),
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(xfs_ino_t, ino)
+ __dynamic_array(char, name, args->namelen)
+ __field(int, namelen)
+ __field(int, valuelen)
+ __field(xfs_dahash_t, hashval)
+ __field(int, op_flags)
+ ),
+ TP_fast_assign(
+ __entry->dev = VFS_I(args->dp)->i_sb->s_dev;
+ __entry->ino = args->dp->i_ino;
+ if (args->namelen)
+ memcpy(__get_str(name), args->name, args->namelen);
+ __entry->namelen = args->namelen;
+ __entry->valuelen = args->valuelen;
+ __entry->hashval = args->hashval;
+ __entry->op_flags = args->op_flags;
+ ),
+ TP_printk("dev %d:%d ino 0x%llx name %.*s namelen %d valuelen %d "
+  "hashval 0x%x op_flags %s",
+  MAJOR(__entry->dev), MINOR(__entry->dev),
+  __entry->ino,
+  __entry->namelen,
+  __entry->namelen ? __get_str(name) : NULL,
+  __entry->namelen,
+  __entry->valuelen,
+  __entry->hashval,
+  __print_flags(__entry->op_flags, "|", XFS_DA_OP_FLAGS))
+)
+
 #define DEFINE_ATTR_EVENT(name) \
-DEFINE_EVENT(xfs_da_class, name, \
+DEFINE_EVENT(xfs_attr_class, name, \
  TP_PROTO(struct xfs_da_args *args), \
  TP_ARGS(args))
 DEFINE_ATTR_EVENT(xfs_attr_sf_add);
@@ -1517,10 +1553,14 @@ DEFINE_ATTR_EVENT(xfs_attr_sf_to_leaf);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_add);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_add_old);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_add_new);
+DEFINE_ATTR_EVENT(xfs_attr_leaf_add_work);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_addname);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_create);
+DEFINE_ATTR_EVENT(xfs_attr_leaf_compact);
+DEFINE_ATTR_EVENT(xfs_attr_leaf_get);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_lookup);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_replace);
+DEFINE_ATTR_EVENT(xfs_attr_leaf_remove);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_removename);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_split);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_split_before);
@@ -1532,12 +1572,21 @@ DEFINE_ATTR_EVENT(xfs_attr_leaf_to_sf);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_to_node);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_rebalance);
 DEFINE_ATTR_EVENT(xfs_attr_leaf_unbalance);
+DEFINE_ATTR_EVENT(xfs_attr_leaf_toosmall);
 
 DEFINE_ATTR_EVENT(xfs_attr_node_addname);
+DEFINE_ATTR_EVENT(xfs_attr_node_get);
 DEFINE_ATTR_EVENT(xfs_attr_node_lookup);
 DEFINE_ATTR_EVENT(xfs_attr_node_replace);
 DEFINE_ATTR_EVENT(xfs_attr_node_removename);
 
+DEFINE_ATTR_EVENT(xfs_attr_fillstate);
+DEFINE_ATTR_EVENT(xfs_attr_refillstate);
+
+DEFINE_ATTR_EVENT(xfs_attr_rmtval_get);
+DEFINE_ATTR_EVENT(xfs_attr_rmtval_set);
+DEFINE_ATTR_EVENT(xfs_attr_rmtval_remove);
+
 #define DEFINE_DA_EVENT(name) \
 DEFINE_EVENT(xfs_da_class, name, \
  TP_PROTO(struct xfs_da_args *args), \
@@ -1556,9 +1605,12 @@ DEFINE_DA_EVENT(xfs_da_node_split);
 DEFINE_DA_EVENT(xfs_da_node_remove);
 DEFINE_DA_EVENT(xfs_da_node_rebalance);
 DEFINE_DA_EVENT(xfs_da_node_unbalance);
+DEFINE_DA_EVENT(xfs_da_node_toosmall);
 DEFINE_DA_EVENT(xfs_da_swap_lastblock);
 DEFINE_DA_EVENT(xfs_da_grow_inode);
 DEFINE_DA_EVENT(xfs_da_shrink_inode);
+DEFINE_DA_EVENT(xfs_da_fixhashpath);
+DEFINE_DA_EVENT(xfs_da_path_shift);
 
 DECLARE_EVENT_CLASS(xfs_dir2_space_class,
  TP_PROTO(struct xfs_da_args *args, int idx),
--
1.7.10

[PATCH 02/32] xfs: remove xfs_tosspages

Dave Chinner
From: Dave Chinner <[hidden email]>

It's a buggy, unnecessary wrapper that is duplicating
truncate_pagecache_range().

When replacing the call in xfs_change_file_space(), also ensure that
the length being allocated/freed is always positive before making
any changes. These checks are done in the lower extent manipulation
functions, too, but we need to do them before any page cache
operations.
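
As a rough illustration of the checks described above, here is a userspace sketch, not the kernel code. The sketch_* names, the two-value command enum and the 4096-byte page size are assumptions made for the example. The upper-bound comparison is rearranged so the sketch never relies on signed overflow, which differs from how the patch writes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096LL	/* assumed page size, for illustration */

/* Hypothetical stand-in for the two groups of commands in the patch. */
enum sketch_cmd { SKETCH_NEEDS_LEN, SKETCH_IGNORES_LEN };

/* Validate the range before any page cache operation touches it. */
static int
sketch_check_range(enum sketch_cmd cmd, int64_t start, int64_t *len,
		   int64_t maxbytes)
{
	assert(maxbytes > 0);

	if (cmd == SKETCH_NEEDS_LEN) {
		if (*len <= 0)
			return EINVAL;	/* positive errno, as in the patch */
	} else {
		*len = 0;		/* length is ignored for this group */
	}

	if (start < 0 || start > maxbytes)
		return EINVAL;
	/* Same as "start + len >= maxbytes", written without overflow. */
	if (*len >= maxbytes - start)
		return EINVAL;
	return 0;
}

/* Last byte to toss: round the end down to a page boundary, minus one. */
static int64_t
sketch_last_byte(int64_t start, int64_t len)
{
	assert(start >= 0 && len >= 0 && len <= INT64_MAX - start);
	return ((start + len) & ~(SKETCH_PAGE_SIZE - 1)) - 1;
}
```

For example, a zero-range request of 10000 bytes at offset 0 gives a last byte of 8191 here, so a partial tail page is left in the cache.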

Reported-by: Andrew Dahl <[hidden email]>
Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_dfrag.c    |    3 +--
 fs/xfs/xfs_fs_subr.c  |   12 ------------
 fs/xfs/xfs_vnodeops.c |   28 +++++++++++++++++++++++-----
 fs/xfs/xfs_vnodeops.h |    2 --
 4 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_dfrag.c b/fs/xfs/xfs_dfrag.c
index b9b8646..b2c63a2 100644
--- a/fs/xfs/xfs_dfrag.c
+++ b/fs/xfs/xfs_dfrag.c
@@ -315,8 +315,7 @@ xfs_swap_extents(
  * are safe.  We don't really care if non-io related
  * fields change.
  */
-
- xfs_tosspages(ip, 0, -1, FI_REMAPF);
+ truncate_pagecache_range(VFS_I(ip), 0, -1);
 
  tp = xfs_trans_alloc(mp, XFS_TRANS_SWAPEXT);
  if ((error = xfs_trans_reserve(tp, 0,
diff --git a/fs/xfs/xfs_fs_subr.c b/fs/xfs/xfs_fs_subr.c
index 652b875..d49de3d 100644
--- a/fs/xfs/xfs_fs_subr.c
+++ b/fs/xfs/xfs_fs_subr.c
@@ -25,18 +25,6 @@
  * note: all filemap functions return negative error codes. These
  * need to be inverted before returning to the xfs core functions.
  */
-void
-xfs_tosspages(
- xfs_inode_t *ip,
- xfs_off_t first,
- xfs_off_t last,
- int fiopt)
-{
- /* can't toss partial tail pages, so mask them out */
- last &= ~(PAGE_SIZE - 1);
- truncate_inode_pages_range(VFS_I(ip)->i_mapping, first, last - 1);
-}
-
 int
 xfs_flushinval_pages(
  xfs_inode_t *ip,
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index c2ddd7a..f7de578 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2118,7 +2118,6 @@ xfs_change_file_space(
  xfs_fsize_t fsize;
  int setprealloc;
  xfs_off_t startoffset;
- xfs_off_t llen;
  xfs_trans_t *tp;
  struct iattr iattr;
  int prealloc_type;
@@ -2139,12 +2138,30 @@ xfs_change_file_space(
  return XFS_ERROR(EINVAL);
  }
 
- llen = bf->l_len > 0 ? bf->l_len - 1 : bf->l_len;
+ /*
+ * length of <= 0 for resv/unresv/zero is invalid.  length for
+ * alloc/free is ignored completely and we have no idea what userspace
+ * might have set it to, so set it to zero to allow range
+ * checks to pass.
+ */
+ switch (cmd) {
+ case XFS_IOC_ZERO_RANGE:
+ case XFS_IOC_RESVSP:
+ case XFS_IOC_RESVSP64:
+ case XFS_IOC_UNRESVSP:
+ case XFS_IOC_UNRESVSP64:
+ if (bf->l_len <= 0)
+ return XFS_ERROR(EINVAL);
+ break;
+ default:
+ bf->l_len = 0;
+ break;
+ }
 
  if (bf->l_start < 0 ||
     bf->l_start > mp->m_super->s_maxbytes ||
-    bf->l_start + llen < 0 ||
-    bf->l_start + llen > mp->m_super->s_maxbytes)
+    bf->l_start + bf->l_len < 0 ||
+    bf->l_start + bf->l_len >= mp->m_super->s_maxbytes)
  return XFS_ERROR(EINVAL);
 
  bf->l_whence = 0;
@@ -2169,7 +2186,8 @@ xfs_change_file_space(
  switch (cmd) {
  case XFS_IOC_ZERO_RANGE:
  prealloc_type |= XFS_BMAPI_CONVERT;
- xfs_tosspages(ip, startoffset, startoffset + bf->l_len, 0);
+ truncate_pagecache_range(VFS_I(ip), startoffset,
+ round_down(startoffset + bf->l_len, PAGE_SIZE) - 1);
  /* FALLTHRU */
  case XFS_IOC_RESVSP:
  case XFS_IOC_RESVSP64:
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index 52fafc4..d48141d 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -48,8 +48,6 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
  int flags, struct attrlist_cursor_kern *cursor);
-void xfs_tosspages(struct xfs_inode *inode, xfs_off_t first,
- xfs_off_t last, int fiopt);
 int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
  xfs_off_t last, int fiopt);
 int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
--
1.7.10

[PATCH 03/32] xfs: remove xfs_wait_on_pages()

Dave Chinner
From: Dave Chinner <[hidden email]>

It's just a simple wrapper around a VFS function that is only called
by another function in xfs_fs_subr.c. Remove it and call the VFS
function directly.

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_fs_subr.c  |   18 ++----------------
 fs/xfs/xfs_vnodeops.h |    1 -
 2 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/xfs/xfs_fs_subr.c b/fs/xfs/xfs_fs_subr.c
index d49de3d..3365823 100644
--- a/fs/xfs/xfs_fs_subr.c
+++ b/fs/xfs/xfs_fs_subr.c
@@ -62,23 +62,9 @@ xfs_flush_pages(
  last == -1 ? LLONG_MAX : last);
  if (flags & XBF_ASYNC)
  return ret;
- ret2 = xfs_wait_on_pages(ip, first, last);
+ ret2 = -filemap_fdatawait_range(mapping, first,
+ last == -1 ? XFS_ISIZE(ip) - 1 : last);
  if (!ret)
  ret = ret2;
  return ret;
 }
-
-int
-xfs_wait_on_pages(
- xfs_inode_t *ip,
- xfs_off_t first,
- xfs_off_t last)
-{
- struct address_space *mapping = VFS_I(ip)->i_mapping;
-
- if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) {
- return -filemap_fdatawait_range(mapping, first,
- last == -1 ? XFS_ISIZE(ip) - 1 : last);
- }
- return 0;
-}
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index d48141d..c8ad48b 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -52,7 +52,6 @@ int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
  xfs_off_t last, int fiopt);
 int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
  xfs_off_t last, uint64_t flags, int fiopt);
-int xfs_wait_on_pages(struct xfs_inode *ip, xfs_off_t first, xfs_off_t last);
 
 int xfs_zero_eof(struct xfs_inode *, xfs_off_t, xfs_fsize_t);
 int xfs_free_eofblocks(struct xfs_mount *, struct xfs_inode *, bool);
--
1.7.10

[PATCH 04/32] xfs: remove xfs_flush_pages

Dave Chinner
From: Dave Chinner <[hidden email]>

It is a complex wrapper around VFS functions, but there are VFS
functions that provide exactly the same functionality. Call the VFS
functions directly and remove the unnecessary indirection and
complexity.

We don't need to care about clearing the XFS_ITRUNCATED flag, as
that is done during .writepages. Hence it is cleared by the VFS
writeback path if there is anything to write back during the flush.
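
One detail worth keeping in mind when reading the conversions below is the sign convention: the comment in xfs_fs_subr.c notes that the filemap functions return negative error codes, which need to be inverted before returning to the XFS core. A tiny userspace sketch of that convention follows. Both functions are hypothetical stand-ins, not real VFS or XFS calls.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical VFS-style helper: 0 on success, negative errno on failure. */
static int
sketch_vfs_flush(int fail)
{
	return fail ? -EIO : 0;
}

/* Caller using the positive-errno convention of the XFS core code here. */
static int
sketch_core_flush(int fail)
{
	int error;

	/* Negate at the call site, as the converted callers do. */
	error = -sketch_vfs_flush(fail);
	assert(error >= 0);
	return error;
}
```

With the wrapper gone, the negation moves to each call site that propagates the error, which is why several of the replacement calls carry a leading minus sign.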

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_aops.c     |    2 +-
 fs/xfs/xfs_bmap.c     |    2 +-
 fs/xfs/xfs_fs_subr.c  |   24 ------------------------
 fs/xfs/xfs_iops.c     |    4 ++--
 fs/xfs/xfs_vnodeops.c |    7 +++++--
 fs/xfs/xfs_vnodeops.h |    2 --
 6 files changed, 9 insertions(+), 32 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index e57e2da..71361da 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1641,7 +1641,7 @@ xfs_vm_bmap(
 
  trace_xfs_vm_bmap(XFS_I(inode));
  xfs_ilock(ip, XFS_IOLOCK_SHARED);
- xfs_flush_pages(ip, (xfs_off_t)0, -1, 0, FI_REMAPF);
+ filemap_write_and_wait(mapping);
  xfs_iunlock(ip, XFS_IOLOCK_SHARED);
  return generic_block_bmap(mapping, block, xfs_get_blocks);
 }
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 83d0cf3..a60f3d1 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -5599,7 +5599,7 @@ xfs_getbmap(
  xfs_ilock(ip, XFS_IOLOCK_SHARED);
  if (whichfork == XFS_DATA_FORK && !(iflags & BMV_IF_DELALLOC)) {
  if (ip->i_delayed_blks || XFS_ISIZE(ip) > ip->i_d.di_size) {
- error = xfs_flush_pages(ip, 0, -1, 0, FI_REMAPF);
+ error = -filemap_write_and_wait(VFS_I(ip)->i_mapping);
  if (error)
  goto out_unlock_iolock;
  }
diff --git a/fs/xfs/xfs_fs_subr.c b/fs/xfs/xfs_fs_subr.c
index 3365823..b538089 100644
--- a/fs/xfs/xfs_fs_subr.c
+++ b/fs/xfs/xfs_fs_subr.c
@@ -44,27 +44,3 @@ xfs_flushinval_pages(
  truncate_inode_pages_range(mapping, first, last);
  return -ret;
 }
-
-int
-xfs_flush_pages(
- xfs_inode_t *ip,
- xfs_off_t first,
- xfs_off_t last,
- uint64_t flags,
- int fiopt)
-{
- struct address_space *mapping = VFS_I(ip)->i_mapping;
- int ret = 0;
- int ret2;
-
- xfs_iflags_clear(ip, XFS_ITRUNCATED);
- ret = -filemap_fdatawrite_range(mapping, first,
- last == -1 ? LLONG_MAX : last);
- if (flags & XBF_ASYNC)
- return ret;
- ret2 = -filemap_fdatawait_range(mapping, first,
- last == -1 ? XFS_ISIZE(ip) - 1 : last);
- if (!ret)
- ret = ret2;
- return ret;
-}
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 81f5c49..d82efaa 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -780,8 +780,8 @@ xfs_setattr_size(
  * care about here.
  */
  if (oldsize != ip->i_d.di_size && newsize > ip->i_d.di_size) {
- error = xfs_flush_pages(ip, ip->i_d.di_size, newsize, 0,
- FI_NONE);
+ error = -filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
+      ip->i_d.di_size, newsize);
  if (error)
  goto out_unlock;
  }
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index f7de578..31de73e 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -428,8 +428,11 @@ xfs_release(
  truncated = xfs_iflags_test_and_clear(ip, XFS_ITRUNCATED);
  if (truncated) {
  xfs_iflags_clear(ip, XFS_IDIRTY_RELEASE);
- if (VN_DIRTY(VFS_I(ip)) && ip->i_delayed_blks > 0)
- xfs_flush_pages(ip, 0, -1, XBF_ASYNC, FI_NONE);
+ if (VN_DIRTY(VFS_I(ip)) && ip->i_delayed_blks > 0) {
+ error = -filemap_flush(VFS_I(ip)->i_mapping);
+ if (error)
+ return error;
+ }
  }
  }
 
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index c8ad48b..73cb3cb 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -50,8 +50,6 @@ int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
  int flags, struct attrlist_cursor_kern *cursor);
 int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
  xfs_off_t last, int fiopt);
-int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
- xfs_off_t last, uint64_t flags, int fiopt);
 
 int xfs_zero_eof(struct xfs_inode *, xfs_off_t, xfs_fsize_t);
 int xfs_free_eofblocks(struct xfs_mount *, struct xfs_inode *, bool);
--
1.7.10

[PATCH 05/32] xfs: remove xfs_flushinval_pages

Dave Chinner
From: Dave Chinner <[hidden email]>

It's just a simple wrapper around VFS functionality, and is actually
buggy in that it doesn't remove mappings before invalidating the
page cache. Remove it and replace it with the correct VFS
functionality.
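
The ordering the replacement code follows (write back, bail out on error, then unmap and drop the cached pages) can be modelled in userspace as below. This is a toy model under stated assumptions: struct sketch_page and the sketch_* functions are invented for illustration and do not correspond to kernel data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page state; not a kernel structure. */
struct sketch_page {
	bool cached;	/* present in the page cache */
	bool dirty;	/* holds data not yet written back */
	bool mapped;	/* mapped into a user address space */
};

/* Write back dirty pages from index first on; fail_io simulates an error. */
static int
sketch_write_and_wait(struct sketch_page *pages, size_t n, size_t first,
		      bool fail_io)
{
	for (size_t i = first; i < n; i++) {
		if (!pages[i].cached || !pages[i].dirty)
			continue;
		if (fail_io)
			return -EIO;	/* negative errno, VFS style */
		pages[i].dirty = false;
	}
	return 0;
}

/* Remove mappings first, then drop the pages from the cache. */
static void
sketch_truncate_range(struct sketch_page *pages, size_t n, size_t first)
{
	for (size_t i = first; i < n; i++) {
		pages[i].mapped = false;
		pages[i].cached = false;
	}
}

/* Flush, and only invalidate if the flush succeeded. */
static int
sketch_flush_and_invalidate(struct sketch_page *pages, size_t n,
			    size_t first, bool fail_io)
{
	int error;

	assert(first <= n);
	error = -sketch_write_and_wait(pages, n, first, fail_io);
	if (error)
		return error;
	sketch_truncate_range(pages, n, first);
	return 0;
}
```

In this model a failed writeback leaves every page cached, dirty and mapped, so no unwritten data is thrown away; pages before the start index are never touched.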

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/Makefile       |    1 -
 fs/xfs/xfs_dfrag.c    |   10 ++++------
 fs/xfs/xfs_file.c     |   23 ++++++++++++-----------
 fs/xfs/xfs_fs_subr.c  |   46 ----------------------------------------------
 fs/xfs/xfs_vnodeops.c |   11 +++++------
 fs/xfs/xfs_vnodeops.h |    2 --
 6 files changed, 21 insertions(+), 72 deletions(-)
 delete mode 100644 fs/xfs/xfs_fs_subr.c

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index e65357b..d02201d 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -37,7 +37,6 @@ xfs-y += xfs_aops.o \
    xfs_file.o \
    xfs_filestream.o \
    xfs_fsops.o \
-   xfs_fs_subr.o \
    xfs_globals.o \
    xfs_icache.o \
    xfs_ioctl.o \
diff --git a/fs/xfs/xfs_dfrag.c b/fs/xfs/xfs_dfrag.c
index b2c63a2..d0e9c74 100644
--- a/fs/xfs/xfs_dfrag.c
+++ b/fs/xfs/xfs_dfrag.c
@@ -246,12 +246,10 @@ xfs_swap_extents(
  goto out_unlock;
  }
 
- if (VN_CACHED(VFS_I(tip)) != 0) {
- error = xfs_flushinval_pages(tip, 0, -1,
- FI_REMAPF_LOCKED);
- if (error)
- goto out_unlock;
- }
+ error = -filemap_write_and_wait(VFS_I(ip)->i_mapping);
+ if (error)
+ goto out_unlock;
+ truncate_pagecache_range(VFS_I(ip), 0, -1);
 
  /* Verify O_DIRECT for ftmp */
  if (VN_CACHED(VFS_I(tip)) != 0) {
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index daf4066..c42f99e 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -255,15 +255,14 @@ xfs_file_aio_read(
  xfs_buftarg_t *target =
  XFS_IS_REALTIME_INODE(ip) ?
  mp->m_rtdev_targp : mp->m_ddev_targp;
- if ((iocb->ki_pos & target->bt_smask) ||
-    (size & target->bt_smask)) {
- if (iocb->ki_pos == i_size_read(inode))
+ if ((pos & target->bt_smask) || (size & target->bt_smask)) {
+ if (pos == i_size_read(inode))
  return 0;
  return -XFS_ERROR(EINVAL);
  }
  }
 
- n = mp->m_super->s_maxbytes - iocb->ki_pos;
+ n = mp->m_super->s_maxbytes - pos;
  if (n <= 0 || size == 0)
  return 0;
 
@@ -289,20 +288,21 @@ xfs_file_aio_read(
  xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
 
  if (inode->i_mapping->nrpages) {
- ret = -xfs_flushinval_pages(ip,
- (iocb->ki_pos & PAGE_CACHE_MASK),
- -1, FI_REMAPF_LOCKED);
+ ret = -filemap_write_and_wait_range(
+ VFS_I(ip)->i_mapping,
+ pos, -1);
  if (ret) {
  xfs_rw_iunlock(ip, XFS_IOLOCK_EXCL);
  return ret;
  }
+ truncate_pagecache_range(VFS_I(ip), pos, -1);
  }
  xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
  }
 
- trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
+ trace_xfs_file_read(ip, size, pos, ioflags);
 
- ret = generic_file_aio_read(iocb, iovp, nr_segs, iocb->ki_pos);
+ ret = generic_file_aio_read(iocb, iovp, nr_segs, pos);
  if (ret > 0)
  XFS_STATS_ADD(xs_read_bytes, ret);
 
@@ -670,10 +670,11 @@ xfs_file_dio_aio_write(
  goto out;
 
  if (mapping->nrpages) {
- ret = -xfs_flushinval_pages(ip, (pos & PAGE_CACHE_MASK), -1,
- FI_REMAPF_LOCKED);
+ ret = -filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
+    pos, -1);
  if (ret)
  goto out;
+ truncate_pagecache_range(VFS_I(ip), pos, -1);
  }
 
  /*
diff --git a/fs/xfs/xfs_fs_subr.c b/fs/xfs/xfs_fs_subr.c
deleted file mode 100644
index b538089..0000000
--- a/fs/xfs/xfs_fs_subr.c
+++ /dev/null
@@ -1,46 +0,0 @@
-/*
- * Copyright (c) 2000-2002,2005-2006 Silicon Graphics, Inc.
- * All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it would be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write the Free Software Foundation,
- * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
- */
-#include "xfs.h"
-#include "xfs_vnodeops.h"
-#include "xfs_bmap_btree.h"
-#include "xfs_inode.h"
-#include "xfs_trace.h"
-
-/*
- * note: all filemap functions return negative error codes. These
- * need to be inverted before returning to the xfs core functions.
- */
-int
-xfs_flushinval_pages(
- xfs_inode_t *ip,
- xfs_off_t first,
- xfs_off_t last,
- int fiopt)
-{
- struct address_space *mapping = VFS_I(ip)->i_mapping;
- int ret = 0;
-
- trace_xfs_pagecache_inval(ip, first, last);
-
- xfs_iflags_clear(ip, XFS_ITRUNCATED);
- ret = filemap_write_and_wait_range(mapping, first,
- last == -1 ? LLONG_MAX : last);
- if (!ret)
- truncate_inode_pages_range(mapping, first, last);
- return -ret;
-}
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 31de73e..165cb92 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -1958,12 +1958,11 @@ xfs_free_file_space(
 
  rounding = max_t(uint, 1 << mp->m_sb.sb_blocklog, PAGE_CACHE_SIZE);
  ioffset = offset & ~(rounding - 1);
-
- if (VN_CACHED(VFS_I(ip)) != 0) {
- error = xfs_flushinval_pages(ip, ioffset, -1, FI_REMAPF_LOCKED);
- if (error)
- goto out_unlock_iolock;
- }
+ error = -filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
+      ioffset, -1);
+ if (error)
+ goto out_unlock_iolock;
+ truncate_pagecache_range(VFS_I(ip), ioffset, -1);
 
  /*
  * Need to zero the stuff we're not freeing, on disk.
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index 73cb3cb..91a03fa 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -48,8 +48,6 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
  int flags, struct attrlist_cursor_kern *cursor);
-int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
- xfs_off_t last, int fiopt);
 
 int xfs_zero_eof(struct xfs_inode *, xfs_off_t, xfs_fsize_t);
 int xfs_free_eofblocks(struct xfs_mount *, struct xfs_inode *, bool);
--
1.7.10

_______________________________________________
xfs mailing list
[hidden email]
http://oss.sgi.com/mailman/listinfo/xfs

[PATCH 06/32] xfs: use btree block initialisation functions in growfs

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Factor xfs_btree_init_block() to be independent of the btree cursor,
and use the function to initialise btree blocks in the growfs code.
This makes adding support for different format btree blocks simple.
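
As a rough userspace illustration of the factoring described above (not
the kernel code itself), the sketch below initialises a block header
from explicit magic/level/numrecs/flags arguments and keeps a thin
cursor-based wrapper on top. All demo_* names, the structure layout and
the flag value are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative. */
#define DEMO_LONG_PTRS	(1u << 0)
#define DEMO_NULL32	0xffffffffu
#define DEMO_NULL64	0xffffffffffffffffull

/* On-disk fields are kept as big-endian byte arrays. */
struct demo_btree_block {
	uint8_t magic[4];
	uint8_t level[2];
	uint8_t numrecs[2];
	union {
		struct { uint8_t leftsib[4], rightsib[4]; } s;	/* short pointers */
		struct { uint8_t leftsib[8], rightsib[8]; } l;	/* long pointers */
	} u;
};

/* Minimal cursor: only the fields the wrapper needs. */
struct demo_cursor {
	uint32_t magic;
	unsigned int flags;
};

/* Store the low n bytes of v at p, most significant byte first. */
static void put_be(uint8_t *p, uint64_t v, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i] = (uint8_t)(v >> (8 * (n - 1 - i)));
}

/* Read an n-byte big-endian value back into host order. */
uint64_t demo_get_be(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	for (size_t i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Cursor-independent initialisation: everything comes from the arguments. */
void demo_init_block(struct demo_btree_block *blk, uint32_t magic,
		     uint16_t level, uint16_t numrecs, unsigned int flags)
{
	put_be(blk->magic, magic, sizeof(blk->magic));
	put_be(blk->level, level, sizeof(blk->level));
	put_be(blk->numrecs, numrecs, sizeof(blk->numrecs));

	if (flags & DEMO_LONG_PTRS) {
		put_be(blk->u.l.leftsib, DEMO_NULL64, 8);
		put_be(blk->u.l.rightsib, DEMO_NULL64, 8);
	} else {
		put_be(blk->u.s.leftsib, DEMO_NULL32, 4);
		put_be(blk->u.s.rightsib, DEMO_NULL32, 4);
	}
}

/* Cursor-based callers keep a thin wrapper that unpacks the cursor. */
void demo_init_block_cur(const struct demo_cursor *cur, uint16_t level,
			 uint16_t numrecs, struct demo_btree_block *blk)
{
	assert(cur != NULL);
	demo_init_block(blk, cur->magic, level, numrecs, cur->flags);
}
```

A caller with no cursor to hand (as growfs has none) would call
demo_init_block() directly, while split/new-root style callers go
through demo_init_block_cur().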

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_btree.c |   33 ++++++++++++++++++++++++---------
 fs/xfs/xfs_btree.h |   11 +++++++++++
 fs/xfs/xfs_fsops.c |   37 +++++++++++++------------------------
 3 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/fs/xfs/xfs_btree.c b/fs/xfs/xfs_btree.c
index e53e317..121ea99 100644
--- a/fs/xfs/xfs_btree.c
+++ b/fs/xfs/xfs_btree.c
@@ -853,18 +853,22 @@ xfs_btree_set_sibling(
  }
 }
 
-STATIC void
+void
 xfs_btree_init_block(
- struct xfs_btree_cur *cur,
- int level,
- int numrecs,
- struct xfs_btree_block *new) /* new block */
+ struct xfs_mount *mp,
+ struct xfs_buf *bp,
+ __u32 magic,
+ __u16 level,
+ __u16 numrecs,
+ unsigned int flags)
 {
- new->bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]);
+ struct xfs_btree_block *new = XFS_BUF_TO_BLOCK(bp);
+
+ new->bb_magic = cpu_to_be32(magic);
  new->bb_level = cpu_to_be16(level);
  new->bb_numrecs = cpu_to_be16(numrecs);
 
- if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+ if (flags & XFS_BTREE_LONG_PTRS) {
  new->bb_u.l.bb_leftsib = cpu_to_be64(NULLDFSBNO);
  new->bb_u.l.bb_rightsib = cpu_to_be64(NULLDFSBNO);
  } else {
@@ -873,6 +877,17 @@ xfs_btree_init_block(
  }
 }
 
+STATIC void
+xfs_btree_init_block_cur(
+ struct xfs_btree_cur *cur,
+ int level,
+ int numrecs,
+ struct xfs_buf *bp)
+{
+ xfs_btree_init_block(cur->bc_mp, bp, xfs_magics[cur->bc_btnum],
+       level, numrecs, cur->bc_flags);
+}
+
 /*
  * Return true if ptr is the last record in the btree and
  * we need to track updates to this record.  The decision
@@ -2183,7 +2198,7 @@ xfs_btree_split(
  goto error0;
 
  /* Fill in the btree header for the new right block. */
- xfs_btree_init_block(cur, xfs_btree_get_level(left), 0, right);
+ xfs_btree_init_block_cur(cur, xfs_btree_get_level(left), 0, rbp);
 
  /*
  * Split the entries between the old and the new block evenly.
@@ -2492,7 +2507,7 @@ xfs_btree_new_root(
  nptr = 2;
  }
  /* Fill in the new block's btree header and log it. */
- xfs_btree_init_block(cur, cur->bc_nlevels, 2, new);
+ xfs_btree_init_block_cur(cur, cur->bc_nlevels, 2, nbp);
  xfs_btree_log_block(cur, nbp, XFS_BB_ALL_BITS);
  ASSERT(!xfs_btree_ptr_is_null(cur, &lptr) &&
  !xfs_btree_ptr_is_null(cur, &rptr));
diff --git a/fs/xfs/xfs_btree.h b/fs/xfs/xfs_btree.h
index 5b240de..c9cf2d0 100644
--- a/fs/xfs/xfs_btree.h
+++ b/fs/xfs/xfs_btree.h
@@ -378,6 +378,17 @@ xfs_btree_reada_bufs(
  xfs_agblock_t agbno, /* allocation group block number */
  xfs_extlen_t count); /* count of filesystem blocks */
 
+/*
+ * Initialise a new btree block header
+ */
+void
+xfs_btree_init_block(
+ struct xfs_mount *mp,
+ struct xfs_buf *bp,
+ __u32 magic,
+ __u16 level,
+ __u16 numrecs,
+ unsigned int flags);
 
 /*
  * Common btree core entry points.
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 7b0a997..a5034af 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -125,7 +125,6 @@ xfs_growfs_data_private(
  xfs_extlen_t agsize;
  xfs_extlen_t tmpsize;
  xfs_alloc_rec_t *arec;
- struct xfs_btree_block *block;
  xfs_buf_t *bp;
  int bucket;
  int dpct;
@@ -263,17 +262,14 @@ xfs_growfs_data_private(
  error = ENOMEM;
  goto error0;
  }
- block = XFS_BUF_TO_BLOCK(bp);
- memset(block, 0, mp->m_sb.sb_blocksize);
- block->bb_magic = cpu_to_be32(XFS_ABTB_MAGIC);
- block->bb_level = 0;
- block->bb_numrecs = cpu_to_be16(1);
- block->bb_u.s.bb_leftsib = cpu_to_be32(NULLAGBLOCK);
- block->bb_u.s.bb_rightsib = cpu_to_be32(NULLAGBLOCK);
- arec = XFS_ALLOC_REC_ADDR(mp, block, 1);
+ xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
+ xfs_btree_init_block(mp, bp, XFS_ABTB_MAGIC, 0, 1, 0);
+
+ arec = XFS_ALLOC_REC_ADDR(mp, XFS_BUF_TO_BLOCK(bp), 1);
  arec->ar_startblock = cpu_to_be32(XFS_PREALLOC_BLOCKS(mp));
  arec->ar_blockcount = cpu_to_be32(
  agsize - be32_to_cpu(arec->ar_startblock));
+
  error = xfs_bwrite(bp);
  xfs_buf_relse(bp);
  if (error)
@@ -289,18 +285,15 @@ xfs_growfs_data_private(
  error = ENOMEM;
  goto error0;
  }
- block = XFS_BUF_TO_BLOCK(bp);
- memset(block, 0, mp->m_sb.sb_blocksize);
- block->bb_magic = cpu_to_be32(XFS_ABTC_MAGIC);
- block->bb_level = 0;
- block->bb_numrecs = cpu_to_be16(1);
- block->bb_u.s.bb_leftsib = cpu_to_be32(NULLAGBLOCK);
- block->bb_u.s.bb_rightsib = cpu_to_be32(NULLAGBLOCK);
- arec = XFS_ALLOC_REC_ADDR(mp, block, 1);
+ xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
+ xfs_btree_init_block(mp, bp, XFS_ABTC_MAGIC, 0, 1, 0);
+
+ arec = XFS_ALLOC_REC_ADDR(mp, XFS_BUF_TO_BLOCK(bp), 1);
  arec->ar_startblock = cpu_to_be32(XFS_PREALLOC_BLOCKS(mp));
  arec->ar_blockcount = cpu_to_be32(
  agsize - be32_to_cpu(arec->ar_startblock));
  nfree += be32_to_cpu(arec->ar_blockcount);
+
  error = xfs_bwrite(bp);
  xfs_buf_relse(bp);
  if (error)
@@ -316,13 +309,9 @@ xfs_growfs_data_private(
  error = ENOMEM;
  goto error0;
  }
- block = XFS_BUF_TO_BLOCK(bp);
- memset(block, 0, mp->m_sb.sb_blocksize);
- block->bb_magic = cpu_to_be32(XFS_IBT_MAGIC);
- block->bb_level = 0;
- block->bb_numrecs = 0;
- block->bb_u.s.bb_leftsib = cpu_to_be32(NULLAGBLOCK);
- block->bb_u.s.bb_rightsib = cpu_to_be32(NULLAGBLOCK);
+ xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
+ xfs_btree_init_block(mp, bp, XFS_IBT_MAGIC, 0, 0, 0);
+
  error = xfs_bwrite(bp);
  xfs_buf_relse(bp);
  if (error)
--
1.7.10


[PATCH 07/32] xfs: growfs: use uncached buffers for new headers

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

When writing the new AG headers to disk, we can't attach write
verifiers because they depend on the struct xfs_perag attached to
the buffer being fully initialised, and growfs can't fully
initialise it until later in the process.

The simplest way to avoid this problem is to use uncached buffers
for writing the new headers. These buffers don't have the xfs_perag
attached to them, so the write verifier can easily detect this and
skip the checks that need the xfs_perag.

This enables us to attach the appropriate buffer ops to the buffer
and hence calculate CRCs on the way to disk. It also means that the
buffer is torn down immediately, and so the first access to the AG
headers will re-read the header from disk and perform full
verification of the buffer. This way we also can catch corruptions
due to problems that went undetected in growfs.
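
To make the "skip the checks that need the xfs_perag" idea concrete,
here is a small userspace sketch of what such a write verifier could
look like. It is not taken from this series: the demo_* names, the
structure layout and the error value are assumptions for illustration
only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_AGF_MAGIC		0x58414746u	/* ASCII "XAGF" */
#define DEMO_EFSCORRUPTED	117		/* illustrative error value */

/* Hypothetical per-AG structure; only the field the check needs. */
struct demo_perag {
	uint32_t agno;
};

struct demo_buf {
	struct demo_perag *pag;	/* NULL for an uncached buffer */
	uint32_t magic;		/* header magic, already in host order */
	uint32_t seqno;		/* AG number recorded in the header */
	int error;
};

/*
 * Write verifier sketch: structural checks always run, checks that
 * need per-AG state run only when that state is attached to the buffer.
 */
void demo_agf_write_verify(struct demo_buf *bp)
{
	assert(bp != NULL);
	bp->error = 0;

	if (bp->magic != DEMO_AGF_MAGIC) {
		bp->error = DEMO_EFSCORRUPTED;
		return;
	}

	/* Uncached buffers (growfs) have no perag: skip this check. */
	if (bp->pag && bp->seqno != bp->pag->agno)
		bp->error = DEMO_EFSCORRUPTED;
}
```

With this shape, the growfs-style uncached buffer still gets its magic
checked on the way to disk, and the perag-dependent check only applies
once the header is re-read through the cache.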

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_fsops.c |   63 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 41 insertions(+), 22 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index a5034af..2196830 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -114,6 +114,26 @@ xfs_fs_geometry(
  return 0;
 }
 
+static struct xfs_buf *
+xfs_growfs_get_hdr_buf(
+ struct xfs_mount *mp,
+ xfs_daddr_t blkno,
+ size_t numblks,
+ int flags)
+{
+ struct xfs_buf *bp;
+
+ bp = xfs_buf_get_uncached(mp->m_ddev_targp, numblks, flags);
+ if (!bp)
+ return NULL;
+
+ xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
+ bp->b_bn = blkno;
+ bp->b_maps[0].bm_bn = blkno;
+
+ return bp;
+}
+
 static int
 xfs_growfs_data_private(
  xfs_mount_t *mp, /* mount point for filesystem */
@@ -189,15 +209,15 @@ xfs_growfs_data_private(
  /*
  * AG freelist header block
  */
- bp = xfs_buf_get(mp->m_ddev_targp,
- XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0);
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
+ XFS_FSS_TO_BB(mp, 1), 0);
  if (!bp) {
  error = ENOMEM;
  goto error0;
  }
+
  agf = XFS_BUF_TO_AGF(bp);
- memset(agf, 0, mp->m_sb.sb_sectsize);
  agf->agf_magicnum = cpu_to_be32(XFS_AGF_MAGIC);
  agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
  agf->agf_seqno = cpu_to_be32(agno);
@@ -226,15 +246,15 @@ xfs_growfs_data_private(
  /*
  * AG inode header block
  */
- bp = xfs_buf_get(mp->m_ddev_targp,
- XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0);
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
+ XFS_FSS_TO_BB(mp, 1), 0);
  if (!bp) {
  error = ENOMEM;
  goto error0;
  }
+
  agi = XFS_BUF_TO_AGI(bp);
- memset(agi, 0, mp->m_sb.sb_sectsize);
  agi->agi_magicnum = cpu_to_be32(XFS_AGI_MAGIC);
  agi->agi_versionnum = cpu_to_be32(XFS_AGI_VERSION);
  agi->agi_seqno = cpu_to_be32(agno);
@@ -255,16 +275,16 @@ xfs_growfs_data_private(
  /*
  * BNO btree root block
  */
- bp = xfs_buf_get(mp->m_ddev_targp,
- XFS_AGB_TO_DADDR(mp, agno, XFS_BNO_BLOCK(mp)),
- BTOBB(mp->m_sb.sb_blocksize), 0);
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AGB_TO_DADDR(mp, agno, XFS_BNO_BLOCK(mp)),
+ BTOBB(mp->m_sb.sb_blocksize), 0);
+
  if (!bp) {
  error = ENOMEM;
  goto error0;
  }
- xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
- xfs_btree_init_block(mp, bp, XFS_ABTB_MAGIC, 0, 1, 0);
 
+ xfs_btree_init_block(mp, bp, XFS_ABTB_MAGIC, 0, 1, 0);
  arec = XFS_ALLOC_REC_ADDR(mp, XFS_BUF_TO_BLOCK(bp), 1);
  arec->ar_startblock = cpu_to_be32(XFS_PREALLOC_BLOCKS(mp));
  arec->ar_blockcount = cpu_to_be32(
@@ -278,16 +298,15 @@ xfs_growfs_data_private(
  /*
  * CNT btree root block
  */
- bp = xfs_buf_get(mp->m_ddev_targp,
- XFS_AGB_TO_DADDR(mp, agno, XFS_CNT_BLOCK(mp)),
- BTOBB(mp->m_sb.sb_blocksize), 0);
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AGB_TO_DADDR(mp, agno, XFS_CNT_BLOCK(mp)),
+ BTOBB(mp->m_sb.sb_blocksize), 0);
  if (!bp) {
  error = ENOMEM;
  goto error0;
  }
- xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
- xfs_btree_init_block(mp, bp, XFS_ABTC_MAGIC, 0, 1, 0);
 
+ xfs_btree_init_block(mp, bp, XFS_ABTC_MAGIC, 0, 1, 0);
  arec = XFS_ALLOC_REC_ADDR(mp, XFS_BUF_TO_BLOCK(bp), 1);
  arec->ar_startblock = cpu_to_be32(XFS_PREALLOC_BLOCKS(mp));
  arec->ar_blockcount = cpu_to_be32(
@@ -302,14 +321,14 @@ xfs_growfs_data_private(
  /*
  * INO btree root block
  */
- bp = xfs_buf_get(mp->m_ddev_targp,
- XFS_AGB_TO_DADDR(mp, agno, XFS_IBT_BLOCK(mp)),
- BTOBB(mp->m_sb.sb_blocksize), 0);
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AGB_TO_DADDR(mp, agno, XFS_IBT_BLOCK(mp)),
+ BTOBB(mp->m_sb.sb_blocksize), 0);
  if (!bp) {
  error = ENOMEM;
  goto error0;
  }
- xfs_buf_zero(bp, 0, BBTOB(bp->b_length));
+
  xfs_btree_init_block(mp, bp, XFS_IBT_MAGIC, 0, 0, 0);
 
  error = xfs_bwrite(bp);
--
1.7.10


[PATCH 08/32] xfs: make growfs initialise the AGFL header

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

For verification purposes, AGFLs need to be initialised to a known
set of values. For upcoming CRC changes, they are also headers that
need to be initialised. Currently, growfs does neither for the AGFLs
- it ignores them completely. Add initialisation of the AGFL to be
full of invalid block numbers (NULLAGBLOCK) to put the
infrastructure in place needed for CRC support.
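
The sketch below shows, in plain userspace C, why a known initial state
helps verification: once every slot is the "null" block number, a
verifier can accept a slot only if it is null or inside the AG. The
512-byte sector size, the demo_* names and the verifier itself are
assumptions for the example, not code from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE	512u		/* assumed sector size */
#define DEMO_NULLAGBLOCK	0xffffffffu	/* "no block" marker */
#define DEMO_AGFL_SLOTS		(DEMO_SECTOR_SIZE / sizeof(uint32_t))

/* One sector of big-endian 32-bit block numbers. */
struct demo_agfl {
	uint8_t bno[DEMO_AGFL_SLOTS][4];
};

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fill every slot with the invalid block number. */
void demo_agfl_init(struct demo_agfl *agfl)
{
	assert(agfl != NULL);
	for (size_t i = 0; i < DEMO_AGFL_SLOTS; i++)
		put_be32(agfl->bno[i], DEMO_NULLAGBLOCK);
}

/* Return 1 if every slot is either null or a block inside the AG. */
int demo_agfl_verify(const struct demo_agfl *agfl, uint32_t agsize)
{
	for (size_t i = 0; i < DEMO_AGFL_SLOTS; i++) {
		uint32_t bno = get_be32(agfl->bno[i]);

		if (bno != DEMO_NULLAGBLOCK && bno >= agsize)
			return 0;
	}
	return 1;
}
```

A freshly initialised list always passes demo_agfl_verify(), whereas a
sector that still holds stale data from a previous use of the device may
not.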

Includes a comment clarification from Jeff Liu.

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_fsops.c |   23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 2196830..bd9cb7f 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -140,6 +140,7 @@ xfs_growfs_data_private(
  xfs_growfs_data_t *in) /* growfs data input struct */
 {
  xfs_agf_t *agf;
+ struct xfs_agfl *agfl;
  xfs_agi_t *agi;
  xfs_agnumber_t agno;
  xfs_extlen_t agsize;
@@ -207,7 +208,7 @@ xfs_growfs_data_private(
  nfree = 0;
  for (agno = nagcount - 1; agno >= oagcount; agno--, new -= agsize) {
  /*
- * AG freelist header block
+ * AG freespace header block
  */
  bp = xfs_growfs_get_hdr_buf(mp,
  XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
@@ -244,6 +245,26 @@ xfs_growfs_data_private(
  goto error0;
 
  /*
+ * AG freelist header block
+ */
+ bp = xfs_growfs_get_hdr_buf(mp,
+ XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
+ XFS_FSS_TO_BB(mp, 1), 0);
+ if (!bp) {
+ error = ENOMEM;
+ goto error0;
+ }
+
+ agfl = XFS_BUF_TO_AGFL(bp);
+ for (bucket = 0; bucket < XFS_AGFL_SIZE(mp); bucket++)
+ agfl->agfl_bno[bucket] = cpu_to_be32(NULLAGBLOCK);
+
+ error = xfs_bwrite(bp);
+ xfs_buf_relse(bp);
+ if (error)
+ goto error0;
+
+ /*
  * AG inode header block
  */
  bp = xfs_growfs_get_hdr_buf(mp,
--
1.7.10


[PATCH 09/32] xfs: make buffer read verification an IO completion function

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add a verifier function callback capability to the buffer read
interfaces.  This will be used by the callers to supply a function
that verifies the contents of the buffer when it is read from disk.
This patch does not provide callback functions, but simply modifies
the interfaces to allow them to be called.

The reason for adding this to the read interfaces is that it is very
difficult to tell from the outside whether a buffer was just read from
disk or was just pulled out of cache. Supplying a callback
allows the buffer cache to use its internal knowledge of the buffer
to execute it only when the buffer is read from disk.
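
A toy model of that behaviour, in userspace C, might look like the
following. The "done" flag stands in for the cache's knowledge that the
contents are already valid, and the verify_calls counter exists only to
make the effect observable. All demo_* names are assumptions for this
sketch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_buf;
typedef void (*demo_iodone_t)(struct demo_buf *);

struct demo_buf {
	int done;		/* contents valid: already read from disk */
	int error;
	demo_iodone_t iodone;	/* completion callback for the next read */
	int verify_calls;	/* instrumentation for the example only */
};

/* Stands in for I/O completion: mark done, then run the callback once. */
static void demo_io_complete(struct demo_buf *bp)
{
	demo_iodone_t cb = bp->iodone;

	bp->done = 1;
	bp->iodone = NULL;
	if (cb)
		cb(bp);
}

/* Example verifier that only counts how often it was invoked. */
void demo_count_verify(struct demo_buf *bp)
{
	bp->verify_calls++;
}

/*
 * Read interface sketch: the callback is attached only when the
 * buffer actually has to be read; a cache hit returns it untouched.
 */
struct demo_buf *demo_buf_read(struct demo_buf *bp, demo_iodone_t verify)
{
	assert(bp != NULL);
	if (!bp->done) {
		bp->iodone = verify;
		demo_io_complete(bp);	/* submit I/O and wait, simulated */
	}
	return bp;
}
```

Calling demo_buf_read() twice on the same buffer runs the verifier once;
the caller never needs to know which of the two calls hit the disk.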

It is intended that the verifier functions will mark the buffer with
an EFSCORRUPTED error when verification fails. This allows the
reading context to distinguish a verification error from an IO
error, and potentially take further actions on the buffer (e.g.
attempt repair) based on the error reported.
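
As an illustration of how a reading context could tell the two failure
classes apart, here is a hedged userspace sketch. The error value, the
demo_* names and the policy of running the verifier only after a
successful device read are assumptions of the example, not something
this patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EFSCORRUPTED	117		/* illustrative error value */
#define DEMO_MAGIC		0x58414746u	/* ASCII "XAGF" */

struct demo_buf {
	uint32_t magic;
	int error;
};

typedef void (*demo_verify_t)(struct demo_buf *);

enum demo_action {
	DEMO_USE_BUFFER,	/* read and verification both succeeded */
	DEMO_FAIL_IO,		/* the device read itself failed */
	DEMO_ATTEMPT_REPAIR,	/* data arrived but failed verification */
};

/* Verifier: flag bad contents with the corruption error. */
void demo_magic_verify(struct demo_buf *bp)
{
	if (bp->magic != DEMO_MAGIC)
		bp->error = DEMO_EFSCORRUPTED;
}

/* io_error simulates the result of the underlying device read. */
void demo_read(struct demo_buf *bp, int io_error, demo_verify_t verify)
{
	assert(bp != NULL);
	bp->error = io_error;
	if (!bp->error && verify)
		verify(bp);
}

/* Reading context: choose a follow-up based on the reported error. */
enum demo_action demo_classify(const struct demo_buf *bp)
{
	if (!bp->error)
		return DEMO_USE_BUFFER;
	if (bp->error == DEMO_EFSCORRUPTED)
		return DEMO_ATTEMPT_REPAIR;
	return DEMO_FAIL_IO;
}
```

The point of the distinct error value is visible in demo_classify():
without it, a corrupt block and a failed device read would be
indistinguishable to the caller.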

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Christoph Hellwig <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_alloc.c       |    4 ++--
 fs/xfs/xfs_attr.c        |    2 +-
 fs/xfs/xfs_btree.c       |   21 ++++++++++++---------
 fs/xfs/xfs_buf.c         |   13 +++++++++----
 fs/xfs/xfs_buf.h         |   20 ++++++++++++--------
 fs/xfs/xfs_da_btree.c    |    4 ++--
 fs/xfs/xfs_dir2_leaf.c   |    2 +-
 fs/xfs/xfs_dquot.c       |    4 ++--
 fs/xfs/xfs_fsops.c       |    4 ++--
 fs/xfs/xfs_ialloc.c      |    2 +-
 fs/xfs/xfs_inode.c       |    2 +-
 fs/xfs/xfs_log.c         |    3 +--
 fs/xfs/xfs_log_recover.c |    8 +++++---
 fs/xfs/xfs_mount.c       |    6 +++---
 fs/xfs/xfs_qm.c          |    5 +++--
 fs/xfs/xfs_rtalloc.c     |    6 +++---
 fs/xfs/xfs_trans.h       |   19 ++++++++-----------
 fs/xfs/xfs_trans_buf.c   |    9 ++++++---
 fs/xfs/xfs_vnodeops.c    |    2 +-
 19 files changed, 75 insertions(+), 61 deletions(-)

diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 3cd7542..34dcb7c 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -447,7 +447,7 @@ xfs_alloc_read_agfl(
  error = xfs_trans_read_buf(
  mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0, &bp);
+ XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);
  if (error)
  return error;
  ASSERT(!xfs_buf_geterror(bp));
@@ -2110,7 +2110,7 @@ xfs_read_agf(
  error = xfs_trans_read_buf(
  mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), flags, bpp);
+ XFS_FSS_TO_BB(mp, 1), flags, bpp, NULL);
  if (error)
  return error;
  if (!*bpp)
diff --git a/fs/xfs/xfs_attr.c b/fs/xfs/xfs_attr.c
index 55bbe98..474c57a 100644
--- a/fs/xfs/xfs_attr.c
+++ b/fs/xfs/xfs_attr.c
@@ -1994,7 +1994,7 @@ xfs_attr_rmtval_get(xfs_da_args_t *args)
  dblkno = XFS_FSB_TO_DADDR(mp, map[i].br_startblock);
  blkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount);
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
-   dblkno, blkcnt, 0, &bp);
+   dblkno, blkcnt, 0, &bp, NULL);
  if (error)
  return(error);
 
diff --git a/fs/xfs/xfs_btree.c b/fs/xfs/xfs_btree.c
index 121ea99..7e79116 100644
--- a/fs/xfs/xfs_btree.c
+++ b/fs/xfs/xfs_btree.c
@@ -266,9 +266,12 @@ xfs_btree_dup_cursor(
  for (i = 0; i < new->bc_nlevels; i++) {
  new->bc_ptrs[i] = cur->bc_ptrs[i];
  new->bc_ra[i] = cur->bc_ra[i];
- if ((bp = cur->bc_bufs[i])) {
- if ((error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
- XFS_BUF_ADDR(bp), mp->m_bsize, 0, &bp))) {
+ bp = cur->bc_bufs[i];
+ if (bp) {
+ error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
+   XFS_BUF_ADDR(bp), mp->m_bsize,
+   0, &bp, NULL);
+ if (error) {
  xfs_btree_del_cursor(new, error);
  *ncur = NULL;
  return error;
@@ -624,10 +627,10 @@ xfs_btree_read_bufl(
 
  ASSERT(fsbno != NULLFSBLOCK);
  d = XFS_FSB_TO_DADDR(mp, fsbno);
- if ((error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, d,
- mp->m_bsize, lock, &bp))) {
+ error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, d,
+   mp->m_bsize, lock, &bp, NULL);
+ if (error)
  return error;
- }
  ASSERT(!xfs_buf_geterror(bp));
  if (bp)
  xfs_buf_set_ref(bp, refval);
@@ -650,7 +653,7 @@ xfs_btree_reada_bufl(
 
  ASSERT(fsbno != NULLFSBLOCK);
  d = XFS_FSB_TO_DADDR(mp, fsbno);
- xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count);
+ xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, NULL);
 }
 
 /*
@@ -670,7 +673,7 @@ xfs_btree_reada_bufs(
  ASSERT(agno != NULLAGNUMBER);
  ASSERT(agbno != NULLAGBLOCK);
  d = XFS_AGB_TO_DADDR(mp, agno, agbno);
- xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count);
+ xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, NULL);
 }
 
 STATIC int
@@ -1013,7 +1016,7 @@ xfs_btree_read_buf_block(
 
  d = xfs_btree_ptr_to_daddr(cur, ptr);
  error = xfs_trans_read_buf(mp, cur->bc_tp, mp->m_ddev_targp, d,
-   mp->m_bsize, flags, bpp);
+   mp->m_bsize, flags, bpp, NULL);
  if (error)
  return error;
 
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 4b0b8dd..0298dd6 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -654,7 +654,8 @@ xfs_buf_read_map(
  struct xfs_buftarg *target,
  struct xfs_buf_map *map,
  int nmaps,
- xfs_buf_flags_t flags)
+ xfs_buf_flags_t flags,
+ xfs_buf_iodone_t verify)
 {
  struct xfs_buf *bp;
 
@@ -666,6 +667,7 @@ xfs_buf_read_map(
 
  if (!XFS_BUF_ISDONE(bp)) {
  XFS_STATS_INC(xb_get_read);
+ bp->b_iodone = verify;
  _xfs_buf_read(bp, flags);
  } else if (flags & XBF_ASYNC) {
  /*
@@ -691,13 +693,14 @@ void
 xfs_buf_readahead_map(
  struct xfs_buftarg *target,
  struct xfs_buf_map *map,
- int nmaps)
+ int nmaps,
+ xfs_buf_iodone_t verify)
 {
  if (bdi_read_congested(target->bt_bdi))
  return;
 
  xfs_buf_read_map(target, map, nmaps,
-     XBF_TRYLOCK|XBF_ASYNC|XBF_READ_AHEAD);
+     XBF_TRYLOCK|XBF_ASYNC|XBF_READ_AHEAD, verify);
 }
 
 /*
@@ -709,7 +712,8 @@ xfs_buf_read_uncached(
  struct xfs_buftarg *target,
  xfs_daddr_t daddr,
  size_t numblks,
- int flags)
+ int flags,
+ xfs_buf_iodone_t verify)
 {
  xfs_buf_t *bp;
  int error;
@@ -723,6 +727,7 @@ xfs_buf_read_uncached(
  bp->b_bn = daddr;
  bp->b_maps[0].bm_bn = daddr;
  bp->b_flags |= XBF_READ;
+ bp->b_iodone = verify;
 
  xfsbdstrat(target->bt_mount, bp);
  error = xfs_buf_iowait(bp);
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 7c0b6a0..677b1dc 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -100,6 +100,7 @@ typedef struct xfs_buftarg {
 struct xfs_buf;
 typedef void (*xfs_buf_iodone_t)(struct xfs_buf *);
 
+
 #define XB_PAGES 2
 
 struct xfs_buf_map {
@@ -159,7 +160,6 @@ typedef struct xfs_buf {
 #endif
 } xfs_buf_t;
 
-
 /* Finding and Reading Buffers */
 struct xfs_buf *_xfs_buf_find(struct xfs_buftarg *target,
       struct xfs_buf_map *map, int nmaps,
@@ -196,9 +196,10 @@ struct xfs_buf *xfs_buf_get_map(struct xfs_buftarg *target,
        xfs_buf_flags_t flags);
 struct xfs_buf *xfs_buf_read_map(struct xfs_buftarg *target,
        struct xfs_buf_map *map, int nmaps,
-       xfs_buf_flags_t flags);
+       xfs_buf_flags_t flags, xfs_buf_iodone_t verify);
 void xfs_buf_readahead_map(struct xfs_buftarg *target,
-       struct xfs_buf_map *map, int nmaps);
+       struct xfs_buf_map *map, int nmaps,
+       xfs_buf_iodone_t verify);
 
 static inline struct xfs_buf *
 xfs_buf_get(
@@ -216,20 +217,22 @@ xfs_buf_read(
  struct xfs_buftarg *target,
  xfs_daddr_t blkno,
  size_t numblks,
- xfs_buf_flags_t flags)
+ xfs_buf_flags_t flags,
+ xfs_buf_iodone_t verify)
 {
  DEFINE_SINGLE_BUF_MAP(map, blkno, numblks);
- return xfs_buf_read_map(target, &map, 1, flags);
+ return xfs_buf_read_map(target, &map, 1, flags, verify);
 }
 
 static inline void
 xfs_buf_readahead(
  struct xfs_buftarg *target,
  xfs_daddr_t blkno,
- size_t numblks)
+ size_t numblks,
+ xfs_buf_iodone_t verify)
 {
  DEFINE_SINGLE_BUF_MAP(map, blkno, numblks);
- return xfs_buf_readahead_map(target, &map, 1);
+ return xfs_buf_readahead_map(target, &map, 1, verify);
 }
 
 struct xfs_buf *xfs_buf_get_empty(struct xfs_buftarg *target, size_t numblks);
@@ -239,7 +242,8 @@ int xfs_buf_associate_memory(struct xfs_buf *bp, void *mem, size_t length);
 struct xfs_buf *xfs_buf_get_uncached(struct xfs_buftarg *target, size_t numblks,
  int flags);
 struct xfs_buf *xfs_buf_read_uncached(struct xfs_buftarg *target,
- xfs_daddr_t daddr, size_t numblks, int flags);
+ xfs_daddr_t daddr, size_t numblks, int flags,
+ xfs_buf_iodone_t verify);
 void xfs_buf_hold(struct xfs_buf *bp);
 
 /* Releasing Buffers */
diff --git a/fs/xfs/xfs_da_btree.c b/fs/xfs/xfs_da_btree.c
index c62e7e6..4af8bad 100644
--- a/fs/xfs/xfs_da_btree.c
+++ b/fs/xfs/xfs_da_btree.c
@@ -2161,7 +2161,7 @@ xfs_da_read_buf(
 
  error = xfs_trans_read_buf_map(dp->i_mount, trans,
  dp->i_mount->m_ddev_targp,
- mapp, nmap, 0, &bp);
+ mapp, nmap, 0, &bp, NULL);
  if (error)
  goto out_free;
 
@@ -2237,7 +2237,7 @@ xfs_da_reada_buf(
  }
 
  mappedbno = mapp[0].bm_bn;
- xfs_buf_readahead_map(dp->i_mount->m_ddev_targp, mapp, nmap);
+ xfs_buf_readahead_map(dp->i_mount->m_ddev_targp, mapp, nmap, NULL);
 
 out_free:
  if (mapp != &map)
diff --git a/fs/xfs/xfs_dir2_leaf.c b/fs/xfs/xfs_dir2_leaf.c
index 0b29625..bac8698 100644
--- a/fs/xfs/xfs_dir2_leaf.c
+++ b/fs/xfs/xfs_dir2_leaf.c
@@ -926,7 +926,7 @@ xfs_dir2_leaf_readbuf(
  XFS_FSB_TO_DADDR(mp,
  map[mip->ra_index].br_startblock +
  mip->ra_offset),
- (int)BTOBB(mp->m_dirblksize));
+ (int)BTOBB(mp->m_dirblksize), NULL);
  mip->ra_current = i;
  }
 
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index bf27fcc..e95f800 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -439,7 +439,7 @@ xfs_qm_dqtobp(
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
    dqp->q_blkno,
    mp->m_quotainfo->qi_dqchunklen,
-   0, &bp);
+   0, &bp, NULL);
  if (error || !bp)
  return XFS_ERROR(error);
  }
@@ -920,7 +920,7 @@ xfs_qm_dqflush(
  * Get the buffer containing the on-disk dquot
  */
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
-   mp->m_quotainfo->qi_dqchunklen, 0, &bp);
+   mp->m_quotainfo->qi_dqchunklen, 0, &bp, NULL);
  if (error)
  goto out_unlock;
 
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index bd9cb7f..5440768 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -168,7 +168,7 @@ xfs_growfs_data_private(
  dpct = pct - mp->m_sb.sb_imax_pct;
  bp = xfs_buf_read_uncached(mp->m_ddev_targp,
  XFS_FSB_TO_BB(mp, nb) - XFS_FSS_TO_BB(mp, 1),
- XFS_FSS_TO_BB(mp, 1), 0);
+ XFS_FSS_TO_BB(mp, 1), 0, NULL);
  if (!bp)
  return EIO;
  xfs_buf_relse(bp);
@@ -439,7 +439,7 @@ xfs_growfs_data_private(
  if (agno < oagcount) {
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
   XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
-  XFS_FSS_TO_BB(mp, 1), 0, &bp);
+  XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);
  } else {
  bp = xfs_trans_get_buf(NULL, mp->m_ddev_targp,
   XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 37753e1..12e3dea 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1490,7 +1490,7 @@ xfs_read_agi(
 
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0, bpp);
+ XFS_FSS_TO_BB(mp, 1), 0, bpp, NULL);
  if (error)
  return error;
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7449cb9..8d69630 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -408,7 +408,7 @@ xfs_imap_to_bp(
 
  buf_flags |= XBF_UNMAPPED;
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap->im_blkno,
-   (int)imap->im_len, buf_flags, &bp);
+   (int)imap->im_len, buf_flags, &bp, NULL);
  if (error) {
  if (error != EAGAIN) {
  xfs_warn(mp,
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 46b6986..1d6d2ee 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1129,8 +1129,7 @@ xlog_iodone(xfs_buf_t *bp)
  * with it being freed after writing the unmount record to the
  * log.
  */
-
-} /* xlog_iodone */
+}
 
 /*
  * Return size of each in-core log record buffer.
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 3e06333..eb1e29f 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2144,7 +2144,7 @@ xlog_recover_buffer_pass2(
  buf_flags |= XBF_UNMAPPED;
 
  bp = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len,
-  buf_flags);
+  buf_flags, NULL);
  if (!bp)
  return XFS_ERROR(ENOMEM);
  error = bp->b_error;
@@ -2237,7 +2237,8 @@ xlog_recover_inode_pass2(
  }
  trace_xfs_log_recover_inode_recover(log, in_f);
 
- bp = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len, 0);
+ bp = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len, 0,
+  NULL);
  if (!bp) {
  error = ENOMEM;
  goto error;
@@ -2548,7 +2549,8 @@ xlog_recover_dquot_pass2(
  ASSERT(dq_f->qlf_len == 1);
 
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
-   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp);
+   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
+   NULL);
  if (error)
  return error;
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 41ae7e1..d5402b0 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -652,7 +652,7 @@ xfs_readsb(xfs_mount_t *mp, int flags)
 
 reread:
  bp = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
- BTOBB(sector_size), 0);
+ BTOBB(sector_size), 0, NULL);
  if (!bp) {
  if (loud)
  xfs_warn(mp, "SB buffer read failed");
@@ -1002,7 +1002,7 @@ xfs_check_sizes(xfs_mount_t *mp)
  }
  bp = xfs_buf_read_uncached(mp->m_ddev_targp,
  d - XFS_FSS_TO_BB(mp, 1),
- XFS_FSS_TO_BB(mp, 1), 0);
+ XFS_FSS_TO_BB(mp, 1), 0, NULL);
  if (!bp) {
  xfs_warn(mp, "last sector read failed");
  return EIO;
@@ -1017,7 +1017,7 @@ xfs_check_sizes(xfs_mount_t *mp)
  }
  bp = xfs_buf_read_uncached(mp->m_logdev_targp,
  d - XFS_FSB_TO_BB(mp, 1),
- XFS_FSB_TO_BB(mp, 1), 0);
+ XFS_FSB_TO_BB(mp, 1), 0, NULL);
  if (!bp) {
  xfs_warn(mp, "log device read failed");
  return EIO;
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 48c750b..688f608 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -892,7 +892,7 @@ xfs_qm_dqiter_bufs(
  while (blkcnt--) {
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
       XFS_FSB_TO_DADDR(mp, bno),
-      mp->m_quotainfo->qi_dqchunklen, 0, &bp);
+      mp->m_quotainfo->qi_dqchunklen, 0, &bp, NULL);
  if (error)
  break;
 
@@ -979,7 +979,8 @@ xfs_qm_dqiterate(
  while (rablkcnt--) {
  xfs_buf_readahead(mp->m_ddev_targp,
        XFS_FSB_TO_DADDR(mp, rablkno),
-       mp->m_quotainfo->qi_dqchunklen);
+       mp->m_quotainfo->qi_dqchunklen,
+       NULL);
  rablkno++;
  }
  }
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index a69e0b4..b271ed9 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -870,7 +870,7 @@ xfs_rtbuf_get(
  ASSERT(map.br_startblock != NULLFSBLOCK);
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
    XFS_FSB_TO_DADDR(mp, map.br_startblock),
-   mp->m_bsize, 0, &bp);
+   mp->m_bsize, 0, &bp, NULL);
  if (error)
  return error;
  ASSERT(!xfs_buf_geterror(bp));
@@ -1873,7 +1873,7 @@ xfs_growfs_rt(
  */
  bp = xfs_buf_read_uncached(mp->m_rtdev_targp,
  XFS_FSB_TO_BB(mp, nrblocks - 1),
- XFS_FSB_TO_BB(mp, 1), 0);
+ XFS_FSB_TO_BB(mp, 1), 0, NULL);
  if (!bp)
  return EIO;
  xfs_buf_relse(bp);
@@ -2220,7 +2220,7 @@ xfs_rtmount_init(
  }
  bp = xfs_buf_read_uncached(mp->m_rtdev_targp,
  d - XFS_FSB_TO_BB(mp, 1),
- XFS_FSB_TO_BB(mp, 1), 0);
+ XFS_FSB_TO_BB(mp, 1), 0, NULL);
  if (!bp) {
  xfs_warn(mp, "realtime device size check failed");
  return EIO;
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index db05654..f02d402 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -464,10 +464,7 @@ xfs_trans_get_buf(
  int numblks,
  uint flags)
 {
- struct xfs_buf_map map = {
- .bm_bn = blkno,
- .bm_len = numblks,
- };
+ DEFINE_SINGLE_BUF_MAP(map, blkno, numblks);
  return xfs_trans_get_buf_map(tp, target, &map, 1, flags);
 }
 
@@ -476,7 +473,8 @@ int xfs_trans_read_buf_map(struct xfs_mount *mp,
        struct xfs_buftarg *target,
        struct xfs_buf_map *map, int nmaps,
        xfs_buf_flags_t flags,
-       struct xfs_buf **bpp);
+       struct xfs_buf **bpp,
+       xfs_buf_iodone_t verify);
 
 static inline int
 xfs_trans_read_buf(
@@ -486,13 +484,12 @@ xfs_trans_read_buf(
  xfs_daddr_t blkno,
  int numblks,
  xfs_buf_flags_t flags,
- struct xfs_buf **bpp)
+ struct xfs_buf **bpp,
+ xfs_buf_iodone_t verify)
 {
- struct xfs_buf_map map = {
- .bm_bn = blkno,
- .bm_len = numblks,
- };
- return xfs_trans_read_buf_map(mp, tp, target, &map, 1, flags, bpp);
+ DEFINE_SINGLE_BUF_MAP(map, blkno, numblks);
+ return xfs_trans_read_buf_map(mp, tp, target, &map, 1,
+      flags, bpp, verify);
 }
 
 struct xfs_buf *xfs_trans_getsb(xfs_trans_t *, struct xfs_mount *, int);
diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c
index 6311b99..9776282 100644
--- a/fs/xfs/xfs_trans_buf.c
+++ b/fs/xfs/xfs_trans_buf.c
@@ -257,7 +257,8 @@ xfs_trans_read_buf_map(
  struct xfs_buf_map *map,
  int nmaps,
  xfs_buf_flags_t flags,
- struct xfs_buf **bpp)
+ struct xfs_buf **bpp,
+ xfs_buf_iodone_t verify)
 {
  xfs_buf_t *bp;
  xfs_buf_log_item_t *bip;
@@ -265,7 +266,7 @@ xfs_trans_read_buf_map(
 
  *bpp = NULL;
  if (!tp) {
- bp = xfs_buf_read_map(target, map, nmaps, flags);
+ bp = xfs_buf_read_map(target, map, nmaps, flags, verify);
  if (!bp)
  return (flags & XBF_TRYLOCK) ?
  EAGAIN : XFS_ERROR(ENOMEM);
@@ -312,7 +313,9 @@ xfs_trans_read_buf_map(
  if (!(XFS_BUF_ISDONE(bp))) {
  trace_xfs_trans_read_buf_io(bp, _RET_IP_);
  ASSERT(!XFS_BUF_ISASYNC(bp));
+ ASSERT(bp->b_iodone == NULL);
  XFS_BUF_READ(bp);
+ bp->b_iodone = verify;
  xfsbdstrat(tp->t_mountp, bp);
  error = xfs_buf_iowait(bp);
  if (error) {
@@ -349,7 +352,7 @@ xfs_trans_read_buf_map(
  return 0;
  }
 
- bp = xfs_buf_read_map(target, map, nmaps, flags);
+ bp = xfs_buf_read_map(target, map, nmaps, flags, verify);
  if (bp == NULL) {
  *bpp = NULL;
  return (flags & XBF_TRYLOCK) ?
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 165cb92..bc70446 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -80,7 +80,7 @@ xfs_readlink_bmap(
  d = XFS_FSB_TO_DADDR(mp, mval[n].br_startblock);
  byte_cnt = XFS_FSB_TO_B(mp, mval[n].br_blockcount);
 
- bp = xfs_buf_read(mp->m_ddev_targp, d, BTOBB(byte_cnt), 0);
+ bp = xfs_buf_read(mp->m_ddev_targp, d, BTOBB(byte_cnt), 0, NULL);
  if (!bp)
  return XFS_ERROR(ENOMEM);
  error = bp->b_error;
--
1.7.10

_______________________________________________
xfs mailing list
[hidden email]
http://oss.sgi.com/mailman/listinfo/xfs

[PATCH 10/32] xfs: uncached buffer reads need to return an error

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

With verification being done as an IO completion callback, different
errors can be returned from a read. Uncached reads only return a
buffer or NULL on failure, which means the verification error cannot
be returned to the caller.

Split the error handling for these reads into two - a failure to get
a buffer will still return NULL, but a read error will return a
referenced buffer with b_error set rather than NULL. The caller is
responsible for checking the error state of the buffer returned.
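
For illustration only, the caller contract described above could be sketched
roughly as follows. This is a minimal userspace model, not the kernel code:
struct demo_buf, demo_read_uncached(), demo_check_device() and the
DEMO_EFSCORRUPTED value are all hypothetical names made up for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_EFSCORRUPTED	117	/* placeholder value, an assumption of this sketch */

/* Hypothetical stand-in for struct xfs_buf; only the error field is modelled. */
struct demo_buf {
	int	b_error;	/* 0 on success, errno-style value on failure */
};

/* Hypothetical verifier type, modelled on an I/O completion callback. */
typedef void (*demo_verify_t)(struct demo_buf *bp);

/*
 * Model of an uncached read: NULL means no buffer could be obtained.
 * Otherwise a buffer is always returned and b_error carries the result
 * of the read and of the optional verification.
 */
static struct demo_buf *
demo_read_uncached(int alloc_fails, int io_error, demo_verify_t verify)
{
	struct demo_buf *bp;

	if (alloc_fails)
		return NULL;
	bp = calloc(1, sizeof(*bp));
	if (!bp)
		return NULL;
	bp->b_error = io_error;
	if (verify && !bp->b_error)
		verify(bp);
	return bp;
}

/* Caller pattern: test for NULL first, then b_error, and always release. */
static int
demo_check_device(int alloc_fails, int io_error, demo_verify_t verify)
{
	struct demo_buf *bp = demo_read_uncached(alloc_fails, io_error, verify);
	int error;

	if (!bp)
		return EIO;
	error = bp->b_error;
	free(bp);
	return error;
}

/* A verifier that always rejects the buffer, used to exercise the error path. */
static void
demo_fail_verify(struct demo_buf *bp)
{
	bp->b_error = DEMO_EFSCORRUPTED;
}
```

The point of the sketch is that the buffer must be released on both the
success and the error path, because a failed read now hands back a
referenced buffer instead of NULL.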

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Christoph Hellwig <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_buf.c     |    9 ++-------
 fs/xfs/xfs_fsops.c   |    5 +++++
 fs/xfs/xfs_mount.c   |    6 ++++++
 fs/xfs/xfs_rtalloc.c |    9 ++++++++-
 4 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 0298dd6..fbc965f 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -715,8 +715,7 @@ xfs_buf_read_uncached(
  int flags,
  xfs_buf_iodone_t verify)
 {
- xfs_buf_t *bp;
- int error;
+ struct xfs_buf *bp;
 
  bp = xfs_buf_get_uncached(target, numblks, flags);
  if (!bp)
@@ -730,11 +729,7 @@ xfs_buf_read_uncached(
  bp->b_iodone = verify;
 
  xfsbdstrat(target->bt_mount, bp);
- error = xfs_buf_iowait(bp);
- if (error) {
- xfs_buf_relse(bp);
- return NULL;
- }
+ xfs_buf_iowait(bp);
  return bp;
 }
 
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 5440768..f35f8d7 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -171,6 +171,11 @@ xfs_growfs_data_private(
  XFS_FSS_TO_BB(mp, 1), 0, NULL);
  if (!bp)
  return EIO;
+ if (bp->b_error) {
+ int error = bp->b_error;
+ xfs_buf_relse(bp);
+ return error;
+ }
  xfs_buf_relse(bp);
 
  new = nb; /* use new as a temporary here */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index d5402b0..df6d0b2 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -658,6 +658,12 @@ reread:
  xfs_warn(mp, "SB buffer read failed");
  return EIO;
  }
+ if (bp->b_error) {
+ error = bp->b_error;
+ if (loud)
+ xfs_warn(mp, "SB validate failed");
+ goto release_buf;
+ }
 
  /*
  * Initialize the mount structure from the superblock.
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index b271ed9..98dc670 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1876,6 +1876,11 @@ xfs_growfs_rt(
  XFS_FSB_TO_BB(mp, 1), 0, NULL);
  if (!bp)
  return EIO;
+ if (bp->b_error) {
+ error = bp->b_error;
+ xfs_buf_relse(bp);
+ return error;
+ }
  xfs_buf_relse(bp);
 
  /*
@@ -2221,8 +2226,10 @@ xfs_rtmount_init(
  bp = xfs_buf_read_uncached(mp->m_rtdev_targp,
  d - XFS_FSB_TO_BB(mp, 1),
  XFS_FSB_TO_BB(mp, 1), 0, NULL);
- if (!bp) {
+ if (!bp || bp->b_error) {
  xfs_warn(mp, "realtime device size check failed");
+ if (bp)
+ xfs_buf_relse(bp);
  return EIO;
  }
  xfs_buf_relse(bp);
--
1.7.10


[PATCH 11/32] xfs: verify superblocks as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add a superblock verify callback function and pass it into the
buffer read functions. Remove the now redundant verification code
that is currently in use.

Adding verification shows that secondary superblocks never have
their "sb_inprogress" flag cleared by mkfs.xfs, so when validating
the secondary superblocks during a grow operation we have to avoid
checking this field. Even if we fix mkfs, we will still have to
ignore this field for verification purposes unless a version of mkfs
that does not have this bug was used.
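
As a rough sketch of the rule above (check the in-progress flag only on the
primary superblock), assuming a much simplified superblock: struct demo_sb,
demo_validate_sb(), demo_sb_verify() and the DEMO_* error values are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SB_MAGIC		0x58465342u	/* "XFSB" */
#define DEMO_SB_DADDR		0		/* primary superblock address in this sketch */
#define DEMO_EWRONGFS		1		/* placeholder error values */
#define DEMO_EFSCORRUPTED	2

/* Hypothetical, heavily reduced in-core superblock. */
struct demo_sb {
	uint32_t	magicnum;
	uint8_t		inprogress;
};

/* Validate a superblock; the in-progress check is optional. */
static int
demo_validate_sb(const struct demo_sb *sbp, bool check_inprogress)
{
	if (sbp->magicnum != DEMO_SB_MAGIC)
		return DEMO_EWRONGFS;
	if (check_inprogress && sbp->inprogress)
		return DEMO_EFSCORRUPTED;
	return 0;
}

/* Only the primary superblock gets the in-progress check. */
static int
demo_sb_verify(const struct demo_sb *sbp, uint64_t daddr)
{
	return demo_validate_sb(sbp, daddr == DEMO_SB_DADDR);
}
```

With this shape a secondary superblock that still carries a stale
in-progress flag passes verification, while the same flag on the primary
is reported as corruption.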

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_fsops.c       |    4 +-
 fs/xfs/xfs_log_recover.c |    5 ++-
 fs/xfs/xfs_mount.c       |   98 +++++++++++++++++++++++++++++-----------------
 fs/xfs/xfs_mount.h       |    3 +-
 4 files changed, 69 insertions(+), 41 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index f35f8d7..cb65b06 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -444,7 +444,8 @@ xfs_growfs_data_private(
  if (agno < oagcount) {
  error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
   XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
-  XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);
+  XFS_FSS_TO_BB(mp, 1), 0, &bp,
+  xfs_sb_read_verify);
  } else {
  bp = xfs_trans_get_buf(NULL, mp->m_ddev_targp,
   XFS_AGB_TO_DADDR(mp, agno, XFS_SB_BLOCK(mp)),
@@ -462,6 +463,7 @@ xfs_growfs_data_private(
  break;
  }
  xfs_sb_to_disk(XFS_BUF_TO_SBP(bp), &mp->m_sb, XFS_SB_ALL_BITS);
+
  /*
  * If we get an error writing out the alternate superblocks,
  * just issue a warning and continue.  The real work is
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index eb1e29f..924a4bc 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3692,13 +3692,14 @@ xlog_do_recover(
 
  /*
  * Now that we've finished replaying all buffer and inode
- * updates, re-read in the superblock.
+ * updates, re-read in the superblock and reverify it.
  */
  bp = xfs_getsb(log->l_mp, 0);
  XFS_BUF_UNDONE(bp);
  ASSERT(!(XFS_BUF_ISWRITE(bp)));
  XFS_BUF_READ(bp);
  XFS_BUF_UNASYNC(bp);
+ bp->b_iodone = xfs_sb_read_verify;
  xfsbdstrat(log->l_mp, bp);
  error = xfs_buf_iowait(bp);
  if (error) {
@@ -3710,7 +3711,7 @@ xlog_do_recover(
 
  /* Convert superblock from on-disk format */
  sbp = &log->l_mp->m_sb;
- xfs_sb_from_disk(log->l_mp, XFS_BUF_TO_SBP(bp));
+ xfs_sb_from_disk(sbp, XFS_BUF_TO_SBP(bp));
  ASSERT(sbp->sb_magicnum == XFS_SB_MAGIC);
  ASSERT(xfs_sb_good_version(sbp));
  xfs_buf_relse(bp);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index df6d0b2..bff18d7 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -304,9 +304,8 @@ STATIC int
 xfs_mount_validate_sb(
  xfs_mount_t *mp,
  xfs_sb_t *sbp,
- int flags)
+ bool check_inprogress)
 {
- int loud = !(flags & XFS_MFSI_QUIET);
 
  /*
  * If the log device and data device have the
@@ -316,21 +315,18 @@ xfs_mount_validate_sb(
  * a volume filesystem in a non-volume manner.
  */
  if (sbp->sb_magicnum != XFS_SB_MAGIC) {
- if (loud)
- xfs_warn(mp, "bad magic number");
+ xfs_warn(mp, "bad magic number");
  return XFS_ERROR(EWRONGFS);
  }
 
  if (!xfs_sb_good_version(sbp)) {
- if (loud)
- xfs_warn(mp, "bad version");
+ xfs_warn(mp, "bad version");
  return XFS_ERROR(EWRONGFS);
  }
 
  if (unlikely(
     sbp->sb_logstart == 0 && mp->m_logdev_targp == mp->m_ddev_targp)) {
- if (loud)
- xfs_warn(mp,
+ xfs_warn(mp,
  "filesystem is marked as having an external log; "
  "specify logdev on the mount command line.");
  return XFS_ERROR(EINVAL);
@@ -338,8 +334,7 @@ xfs_mount_validate_sb(
 
  if (unlikely(
     sbp->sb_logstart != 0 && mp->m_logdev_targp != mp->m_ddev_targp)) {
- if (loud)
- xfs_warn(mp,
+ xfs_warn(mp,
  "filesystem is marked as having an internal log; "
  "do not specify logdev on the mount command line.");
  return XFS_ERROR(EINVAL);
@@ -373,8 +368,7 @@ xfs_mount_validate_sb(
     sbp->sb_dblocks == 0 ||
     sbp->sb_dblocks > XFS_MAX_DBLOCKS(sbp) ||
     sbp->sb_dblocks < XFS_MIN_DBLOCKS(sbp))) {
- if (loud)
- XFS_CORRUPTION_ERROR("SB sanity check failed",
+ XFS_CORRUPTION_ERROR("SB sanity check failed",
  XFS_ERRLEVEL_LOW, mp, sbp);
  return XFS_ERROR(EFSCORRUPTED);
  }
@@ -383,12 +377,10 @@ xfs_mount_validate_sb(
  * Until this is fixed only page-sized or smaller data blocks work.
  */
  if (unlikely(sbp->sb_blocksize > PAGE_SIZE)) {
- if (loud) {
- xfs_warn(mp,
+ xfs_warn(mp,
  "File system with blocksize %d bytes. "
  "Only pagesize (%ld) or less will currently work.",
  sbp->sb_blocksize, PAGE_SIZE);
- }
  return XFS_ERROR(ENOSYS);
  }
 
@@ -402,23 +394,20 @@ xfs_mount_validate_sb(
  case 2048:
  break;
  default:
- if (loud)
- xfs_warn(mp, "inode size of %d bytes not supported",
+ xfs_warn(mp, "inode size of %d bytes not supported",
  sbp->sb_inodesize);
  return XFS_ERROR(ENOSYS);
  }
 
  if (xfs_sb_validate_fsb_count(sbp, sbp->sb_dblocks) ||
     xfs_sb_validate_fsb_count(sbp, sbp->sb_rblocks)) {
- if (loud)
- xfs_warn(mp,
+ xfs_warn(mp,
  "file system too large to be mounted on this system.");
  return XFS_ERROR(EFBIG);
  }
 
- if (unlikely(sbp->sb_inprogress)) {
- if (loud)
- xfs_warn(mp, "file system busy");
+ if (check_inprogress && sbp->sb_inprogress) {
+ xfs_warn(mp, "Offline file system operation in progress!");
  return XFS_ERROR(EFSCORRUPTED);
  }
 
@@ -426,9 +415,7 @@ xfs_mount_validate_sb(
  * Version 1 directory format has never worked on Linux.
  */
  if (unlikely(!xfs_sb_version_hasdirv2(sbp))) {
- if (loud)
- xfs_warn(mp,
- "file system using version 1 directory format");
+ xfs_warn(mp, "file system using version 1 directory format");
  return XFS_ERROR(ENOSYS);
  }
 
@@ -521,11 +508,9 @@ out_unwind:
 
 void
 xfs_sb_from_disk(
- struct xfs_mount *mp,
+ struct xfs_sb *to,
  xfs_dsb_t *from)
 {
- struct xfs_sb *to = &mp->m_sb;
-
  to->sb_magicnum = be32_to_cpu(from->sb_magicnum);
  to->sb_blocksize = be32_to_cpu(from->sb_blocksize);
  to->sb_dblocks = be64_to_cpu(from->sb_dblocks);
@@ -627,6 +612,50 @@ xfs_sb_to_disk(
  }
 }
 
+void
+xfs_sb_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_sb sb;
+ int error;
+
+ xfs_sb_from_disk(&sb, XFS_BUF_TO_SBP(bp));
+
+ /*
+ * Only check the in progress field for the primary superblock as
+ * mkfs.xfs doesn't clear it from secondary superblocks.
+ */
+ error = xfs_mount_validate_sb(mp, &sb, bp->b_bn == XFS_SB_DADDR);
+ if (error)
+ xfs_buf_ioerror(bp, error);
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
+/*
+ * We may be probed for a filesystem match, so we may not want to emit
+ * messages when the superblock buffer is not actually an XFS superblock.
+ * If we find an XFS superblock, then run a normal, noisy mount because we are
+ * really going to mount it and want to know about errors.
+ */
+void
+xfs_sb_quiet_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_sb sb;
+
+ xfs_sb_from_disk(&sb, XFS_BUF_TO_SBP(bp));
+
+ if (sb.sb_magicnum == XFS_SB_MAGIC) {
+ /* XFS filesystem, verify noisily! */
+ xfs_sb_read_verify(bp);
+ return;
+ }
+ /* quietly fail */
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+}
+
 /*
  * xfs_readsb
  *
@@ -652,7 +681,9 @@ xfs_readsb(xfs_mount_t *mp, int flags)
 
 reread:
  bp = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_SB_DADDR,
- BTOBB(sector_size), 0, NULL);
+   BTOBB(sector_size), 0,
+   loud ? xfs_sb_read_verify
+        : xfs_sb_quiet_read_verify);
  if (!bp) {
  if (loud)
  xfs_warn(mp, "SB buffer read failed");
@@ -667,15 +698,8 @@ reread:
 
  /*
  * Initialize the mount structure from the superblock.
- * But first do some basic consistency checking.
  */
- xfs_sb_from_disk(mp, XFS_BUF_TO_SBP(bp));
- error = xfs_mount_validate_sb(mp, &(mp->m_sb), flags);
- if (error) {
- if (loud)
- xfs_warn(mp, "SB validate failed");
- goto release_buf;
- }
+ xfs_sb_from_disk(&mp->m_sb, XFS_BUF_TO_SBP(bp));
 
  /*
  * We must be able to do sector-sized and sector-aligned IO.
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index dc306a0..de9089a 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -385,10 +385,11 @@ extern void xfs_set_low_space_thresholds(struct xfs_mount *);
 
 #endif /* __KERNEL__ */
 
+extern void xfs_sb_read_verify(struct xfs_buf *);
 extern void xfs_mod_sb(struct xfs_trans *, __int64_t);
 extern int xfs_initialize_perag(struct xfs_mount *, xfs_agnumber_t,
  xfs_agnumber_t *);
-extern void xfs_sb_from_disk(struct xfs_mount *, struct xfs_dsb *);
+extern void xfs_sb_from_disk(struct xfs_sb *, struct xfs_dsb *);
 extern void xfs_sb_to_disk(struct xfs_dsb *, struct xfs_sb *, __int64_t);
 
 #endif /* __XFS_MOUNT_H__ */
--
1.7.10


[PATCH 12/32] xfs: verify AGF blocks as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add an AGF block verify callback function and pass it into the
buffer read functions. This replaces the existing verification that
is done after the read completes.
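
A simplified userspace sketch of such a verify callback might look like the
following. It assumes host-endian fields and a cut-down header, so struct
demo_agf, struct demo_buf, demo_agf_read_verify() and DEMO_EFSCORRUPTED are
hypothetical stand-ins rather than the real definitions. The sequence number
is only checked when per-AG data is attached to the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_AGF_MAGIC		0x58414746u	/* "XAGF" */
#define DEMO_EFSCORRUPTED	117		/* placeholder error value */

/* Hypothetical per-AG structure; only the AG number is modelled. */
struct demo_perag {
	uint32_t	pag_agno;
};

/* Hypothetical, host-endian subset of the AGF header. */
struct demo_agf {
	uint32_t	magicnum;
	uint32_t	seqno;
	uint32_t	length;
	uint32_t	freeblks;
	uint32_t	flfirst;
	uint32_t	fllast;
	uint32_t	flcount;
};

/* Hypothetical buffer: contents, optional perag, and an error field. */
struct demo_buf {
	struct demo_agf		agf;
	const struct demo_perag	*b_pag;	/* NULL for uncached buffers */
	int			b_error;
};

/* Record EFSCORRUPTED on the buffer if any consistency check fails. */
static void
demo_agf_read_verify(struct demo_buf *bp, uint32_t agfl_size)
{
	const struct demo_agf *agf = &bp->agf;
	int ok;

	ok = agf->magicnum == DEMO_AGF_MAGIC &&
	     agf->freeblks <= agf->length &&
	     agf->flfirst < agfl_size &&
	     agf->fllast < agfl_size &&
	     agf->flcount <= agfl_size;

	/* The AG number can only be cross-checked when a perag is attached. */
	if (bp->b_pag)
		ok = ok && agf->seqno == bp->b_pag->pag_agno;

	if (!ok)
		bp->b_error = DEMO_EFSCORRUPTED;
}
```

The caller then only has to look at the buffer's error state after the
read, instead of repeating the checks itself.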

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Christoph Hellwig <[hidden email]>
---
 fs/xfs/xfs_alloc.c |   69 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 43 insertions(+), 26 deletions(-)

diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 34dcb7c..cebac40 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -2091,6 +2091,48 @@ xfs_alloc_put_freelist(
  return 0;
 }
 
+static void
+xfs_agf_read_verify(
+ struct xfs_buf *bp)
+ {
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_agf *agf;
+ int agf_ok;
+
+ agf = XFS_BUF_TO_AGF(bp);
+
+ agf_ok = agf->agf_magicnum == cpu_to_be32(XFS_AGF_MAGIC) &&
+ XFS_AGF_GOOD_VERSION(be32_to_cpu(agf->agf_versionnum)) &&
+ be32_to_cpu(agf->agf_freeblks) <= be32_to_cpu(agf->agf_length) &&
+ be32_to_cpu(agf->agf_flfirst) < XFS_AGFL_SIZE(mp) &&
+ be32_to_cpu(agf->agf_fllast) < XFS_AGFL_SIZE(mp) &&
+ be32_to_cpu(agf->agf_flcount) <= XFS_AGFL_SIZE(mp) &&
+ be32_to_cpu(agf->agf_flcount) <= XFS_AGFL_SIZE(mp);
+
+ /*
+ * during growfs operations, the perag is not fully initialised,
+ * so we can't use it for any useful checking. growfs ensures we can't
+ * use it by using uncached buffers that don't have the perag attached
+ * so we can detect and avoid this problem.
+ */
+ if (bp->b_pag)
+ agf_ok = agf_ok && be32_to_cpu(agf->agf_seqno) ==
+ bp->b_pag->pag_agno;
+
+ if (xfs_sb_version_haslazysbcount(&mp->m_sb))
+ agf_ok = agf_ok && be32_to_cpu(agf->agf_btreeblks) <=
+ be32_to_cpu(agf->agf_length);
+
+ if (unlikely(XFS_TEST_ERROR(!agf_ok, mp, XFS_ERRTAG_ALLOC_READ_AGF,
+ XFS_RANDOM_ALLOC_READ_AGF))) {
+ XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, agf);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 /*
  * Read in the allocation group header (free/alloc section).
  */
@@ -2102,44 +2144,19 @@ xfs_read_agf(
  int flags, /* XFS_BUF_ */
  struct xfs_buf **bpp) /* buffer for the ag freelist header */
 {
- struct xfs_agf *agf; /* ag freelist header */
- int agf_ok; /* set if agf is consistent */
  int error;
 
  ASSERT(agno != NULLAGNUMBER);
  error = xfs_trans_read_buf(
  mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), flags, bpp, NULL);
+ XFS_FSS_TO_BB(mp, 1), flags, bpp, xfs_agf_read_verify);
  if (error)
  return error;
  if (!*bpp)
  return 0;
 
  ASSERT(!(*bpp)->b_error);
- agf = XFS_BUF_TO_AGF(*bpp);
-
- /*
- * Validate the magic number of the agf block.
- */
- agf_ok =
- agf->agf_magicnum == cpu_to_be32(XFS_AGF_MAGIC) &&
- XFS_AGF_GOOD_VERSION(be32_to_cpu(agf->agf_versionnum)) &&
- be32_to_cpu(agf->agf_freeblks) <= be32_to_cpu(agf->agf_length) &&
- be32_to_cpu(agf->agf_flfirst) < XFS_AGFL_SIZE(mp) &&
- be32_to_cpu(agf->agf_fllast) < XFS_AGFL_SIZE(mp) &&
- be32_to_cpu(agf->agf_flcount) <= XFS_AGFL_SIZE(mp) &&
- be32_to_cpu(agf->agf_seqno) == agno;
- if (xfs_sb_version_haslazysbcount(&mp->m_sb))
- agf_ok = agf_ok && be32_to_cpu(agf->agf_btreeblks) <=
- be32_to_cpu(agf->agf_length);
- if (unlikely(XFS_TEST_ERROR(!agf_ok, mp, XFS_ERRTAG_ALLOC_READ_AGF,
- XFS_RANDOM_ALLOC_READ_AGF))) {
- XFS_CORRUPTION_ERROR("xfs_alloc_read_agf",
-     XFS_ERRLEVEL_LOW, mp, agf);
- xfs_trans_brelse(tp, *bpp);
- return XFS_ERROR(EFSCORRUPTED);
- }
  xfs_buf_set_ref(*bpp, XFS_AGF_REF);
  return 0;
 }
--
1.7.10


[PATCH 13/32] xfs: verify AGI blocks as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add an AGI block verify callback function and pass it into the
buffer read functions. Remove the now redundant verification code
that is currently in use.

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Christoph Hellwig <[hidden email]>
---
 fs/xfs/xfs_ialloc.c |   56 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 12e3dea..5bd255e 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1472,6 +1472,40 @@ xfs_check_agi_unlinked(
 #define xfs_check_agi_unlinked(agi)
 #endif
 
+static void
+xfs_agi_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_agi *agi = XFS_BUF_TO_AGI(bp);
+ int agi_ok;
+
+ /*
+ * Validate the magic number of the agi block.
+ */
+ agi_ok = agi->agi_magicnum == cpu_to_be32(XFS_AGI_MAGIC) &&
+ XFS_AGI_GOOD_VERSION(be32_to_cpu(agi->agi_versionnum));
+
+ /*
+ * during growfs operations, the perag is not fully initialised,
+ * so we can't use it for any useful checking. growfs ensures we can't
+ * use it by using uncached buffers that don't have the perag attached
+ * so we can detect and avoid this problem.
+ */
+ if (bp->b_pag)
+ agi_ok = agi_ok && be32_to_cpu(agi->agi_seqno) ==
+ bp->b_pag->pag_agno;
+
+ if (unlikely(XFS_TEST_ERROR(!agi_ok, mp, XFS_ERRTAG_IALLOC_READ_AGI,
+ XFS_RANDOM_IALLOC_READ_AGI))) {
+ XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, agi);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+ xfs_check_agi_unlinked(agi);
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 /*
  * Read in the allocation group header (inode allocation section)
  */
@@ -1482,38 +1516,18 @@ xfs_read_agi(
  xfs_agnumber_t agno, /* allocation group number */
  struct xfs_buf **bpp) /* allocation group hdr buf */
 {
- struct xfs_agi *agi; /* allocation group header */
- int agi_ok; /* agi is consistent */
  int error;
 
  ASSERT(agno != NULLAGNUMBER);
 
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0, bpp, NULL);
+ XFS_FSS_TO_BB(mp, 1), 0, bpp, xfs_agi_read_verify);
  if (error)
  return error;
 
  ASSERT(!xfs_buf_geterror(*bpp));
- agi = XFS_BUF_TO_AGI(*bpp);
-
- /*
- * Validate the magic number of the agi block.
- */
- agi_ok = agi->agi_magicnum == cpu_to_be32(XFS_AGI_MAGIC) &&
- XFS_AGI_GOOD_VERSION(be32_to_cpu(agi->agi_versionnum)) &&
- be32_to_cpu(agi->agi_seqno) == agno;
- if (unlikely(XFS_TEST_ERROR(!agi_ok, mp, XFS_ERRTAG_IALLOC_READ_AGI,
- XFS_RANDOM_IALLOC_READ_AGI))) {
- XFS_CORRUPTION_ERROR("xfs_read_agi", XFS_ERRLEVEL_LOW,
-     mp, agi);
- xfs_trans_brelse(tp, *bpp);
- return XFS_ERROR(EFSCORRUPTED);
- }
-
  xfs_buf_set_ref(*bpp, XFS_AGI_REF);
-
- xfs_check_agi_unlinked(agi);
  return 0;
 }
 
--
1.7.10


[PATCH 14/32] xfs: verify AGFL blocks as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add an AGFL block verify callback function and pass it into the
buffer read functions.

While this commit adds verification code to the AGFL, it cannot be
used reliably until the CRC format change comes along as mkfs does
not initialise the full AGFL. Hence it can be full of garbage at the
first mount and will fail verification right now. CRC enabled
filesystems won't have this problem, so leave the code that has
already been written ifdef'd out until the proper time.
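
To make the failure mode concrete, here is a small sketch of the bounds
check on free list entries, assuming host-endian block numbers. The names
demo_agfl_entries_ok() and DEMO_NULLAGBLOCK are hypothetical. Any slot that
mkfs left uninitialised would typically fail this check, which is why it
cannot be enabled yet.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NULLAGBLOCK	0xffffffffu	/* "no block" marker in this sketch */

/*
 * Return 1 if every entry is a plausible AG block number, 0 otherwise.
 * Entries are checked whether or not they are currently active.
 */
static int
demo_agfl_entries_ok(const uint32_t *bno, size_t count, uint32_t agblocks)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (bno[i] == DEMO_NULLAGBLOCK || bno[i] >= agblocks)
			return 0;
	}
	return 1;
}
```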

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_alloc.c |   39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index cebac40..506b346 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -430,6 +430,43 @@ xfs_alloc_fixup_trees(
  return 0;
 }
 
+void
+xfs_agfl_read_verify(
+ struct xfs_buf *bp)
+{
+#ifdef WHEN_CRCS_COME_ALONG
+ /*
+ * we cannot actually do any verification of the AGFL because mkfs does
+ * not initialise the AGFL to zero or NULL. Hence the only valid part of
+ * the AGFL is what the AGF says is active. We can't get to the AGF, so
+ * we can't verify just those entries are valid.
+ *
+ * This problem goes away when the CRC format change comes along as that
+ * requires the AGFL to be initialised by mkfs. At that point, we can
+ * verify the blocks in the agfl -active or not- lie within the bounds
+ * of the AG. Until then, just leave this check ifdef'd out.
+ */
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_agfl *agfl = XFS_BUF_TO_AGFL(bp);
+ int agfl_ok = 1;
+
+ int i;
+
+ for (i = 0; i < XFS_AGFL_SIZE(mp); i++) {
+ if (be32_to_cpu(agfl->agfl_bno[i]) == NULLAGBLOCK ||
+    be32_to_cpu(agfl->agfl_bno[i]) >= mp->m_sb.sb_agblocks)
+ agfl_ok = 0;
+ }
+
+ if (!agfl_ok) {
+ XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, agfl);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+#endif
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 /*
  * Read in the allocation group free block array.
  */
@@ -447,7 +484,7 @@ xfs_alloc_read_agfl(
  error = xfs_trans_read_buf(
  mp, tp, mp->m_ddev_targp,
  XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)),
- XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);
+ XFS_FSS_TO_BB(mp, 1), 0, &bp, xfs_agfl_read_verify);
  if (error)
  return error;
  ASSERT(!xfs_buf_geterror(bp));
--
1.7.10


[PATCH 15/32] xfs: verify inode buffers as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add an inode buffer verify callback function and pass it into the
buffer read functions. Inodes are special in that the verbose checks
will be done when reading the inode, but we still need to sanity
check the buffer when that is first read. Always verify the magic
numbers in all inodes in the buffer, rather than just on debug
kernels.
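
A minimal sketch of the "check every inode in the buffer" loop, assuming a
raw byte buffer, a big-endian 16-bit magic at the start of each inode and an
inode size of (1 << inodelog) bytes. The helper name demo_first_bad_inode()
is hypothetical, and only the magic number is checked here, not the version.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_DINODE_MAGIC	0x494eu		/* "IN" */

/*
 * Walk every inode slot in the buffer and return the index of the first
 * one with a bad magic number, or -1 if all of them look sane.
 */
static int
demo_first_bad_inode(const unsigned char *buf, size_t len, unsigned int inodelog)
{
	size_t ni = len >> inodelog;	/* number of inodes in the buffer */
	size_t i;

	for (i = 0; i < ni; i++) {
		const unsigned char *dip = buf + (i << inodelog);
		/* the magic is stored big-endian on disk */
		uint16_t magic = (uint16_t)((dip[0] << 8) | dip[1]);

		if (magic != DEMO_DINODE_MAGIC)
			return (int)i;
	}
	return -1;
}
```

For a 1024 byte buffer with inodelog = 8 this walks four inode slots, so
corruption in any of them is caught rather than only in the first.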

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_inode.c |  100 +++++++++++++++++++++++++++-------------------------
 1 file changed, 51 insertions(+), 49 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 8d69630..514eac9 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -382,6 +382,46 @@ xfs_inobp_check(
 }
 #endif
 
+static void
+xfs_inode_buf_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ int i;
+ int ni;
+
+ /*
+ * Validate the magic number and version of every inode in the buffer
+ */
+ ni = XFS_BB_TO_FSB(mp, bp->b_length) * mp->m_sb.sb_inopblock;
+ for (i = 0; i < ni; i++) {
+ int di_ok;
+ xfs_dinode_t *dip;
+
+ dip = (struct xfs_dinode *)xfs_buf_offset(bp,
+ (i << mp->m_sb.sb_inodelog));
+ di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
+    XFS_DINODE_GOOD_VERSION(dip->di_version);
+ if (unlikely(XFS_TEST_ERROR(!di_ok, mp,
+ XFS_ERRTAG_ITOBP_INOTOBP,
+ XFS_RANDOM_ITOBP_INOTOBP))) {
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_HIGH,
+     mp, dip);
+#ifdef DEBUG
+ xfs_emerg(mp,
+ "bad inode magic/vsn daddr %lld #%d (magic=%x)",
+ (unsigned long long)bp->b_bn, i,
+ be16_to_cpu(dip->di_magic));
+ ASSERT(0);
+#endif
+ }
+ }
+ xfs_inobp_check(mp, bp);
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 /*
  * This routine is called to map an inode to the buffer containing the on-disk
  * version of the inode.  It returns a pointer to the buffer containing the
@@ -396,71 +436,33 @@ xfs_imap_to_bp(
  struct xfs_mount *mp,
  struct xfs_trans *tp,
  struct xfs_imap *imap,
- struct xfs_dinode **dipp,
+ struct xfs_dinode       **dipp,
  struct xfs_buf **bpp,
  uint buf_flags,
  uint iget_flags)
 {
  struct xfs_buf *bp;
  int error;
- int i;
- int ni;
 
  buf_flags |= XBF_UNMAPPED;
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap->im_blkno,
-   (int)imap->im_len, buf_flags, &bp, NULL);
+   (int)imap->im_len, buf_flags, &bp,
+   xfs_inode_buf_verify);
  if (error) {
- if (error != EAGAIN) {
- xfs_warn(mp,
- "%s: xfs_trans_read_buf() returned error %d.",
- __func__, error);
- } else {
+ if (error == EAGAIN) {
  ASSERT(buf_flags & XBF_TRYLOCK);
+ return error;
  }
- return error;
- }
 
- /*
- * Validate the magic number and version of every inode in the buffer
- * (if DEBUG kernel) or the first inode in the buffer, otherwise.
- */
-#ifdef DEBUG
- ni = BBTOB(imap->im_len) >> mp->m_sb.sb_inodelog;
-#else /* usual case */
- ni = 1;
-#endif
+ if (error == EFSCORRUPTED &&
+    (iget_flags & XFS_IGET_UNTRUSTED))
+ return XFS_ERROR(EINVAL);
 
- for (i = 0; i < ni; i++) {
- int di_ok;
- xfs_dinode_t *dip;
-
- dip = (xfs_dinode_t *)xfs_buf_offset(bp,
- (i << mp->m_sb.sb_inodelog));
- di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) &&
-    XFS_DINODE_GOOD_VERSION(dip->di_version);
- if (unlikely(XFS_TEST_ERROR(!di_ok, mp,
- XFS_ERRTAG_ITOBP_INOTOBP,
- XFS_RANDOM_ITOBP_INOTOBP))) {
- if (iget_flags & XFS_IGET_UNTRUSTED) {
- xfs_trans_brelse(tp, bp);
- return XFS_ERROR(EINVAL);
- }
- XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_HIGH,
-     mp, dip);
-#ifdef DEBUG
- xfs_emerg(mp,
- "bad inode magic/vsn daddr %lld #%d (magic=%x)",
- (unsigned long long)imap->im_blkno, i,
- be16_to_cpu(dip->di_magic));
- ASSERT(0);
-#endif
- xfs_trans_brelse(tp, bp);
- return XFS_ERROR(EFSCORRUPTED);
- }
+ xfs_warn(mp, "%s: xfs_trans_read_buf() returned error %d.",
+ __func__, error);
+ return error;
  }
 
- xfs_inobp_check(mp, bp);
-
  *bpp = bp;
  *dipp = (struct xfs_dinode *)xfs_buf_offset(bp, imap->im_boffset);
  return 0;
--
1.7.10


[PATCH 16/32] xfs: verify btree blocks as they are read from disk

Dave Chinner
In reply to this post by Dave Chinner
From: Dave Chinner <[hidden email]>

Add a btree block verify callback function and pass it into the
buffer read functions. Because each different btree block type
requires different verification, add a function to the ops structure
that is called from the generic code.

Also, propagate the verification callback functions through the
readahead functions, and into the external bmap and bulkstat inode
readahead code that uses the generic btree buffer read functions.
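
The ops-structure dispatch described above can be sketched along these
lines. This is a userspace model under stated assumptions: struct demo_buf,
struct demo_btree_ops, demo_btree_read() and the two verifiers are
hypothetical, and each verifier only checks a magic number.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ABTB_MAGIC		0x41425442u	/* "ABTB" */
#define DEMO_BMAP_MAGIC		0x424d4150u	/* "BMAP" */
#define DEMO_EFSCORRUPTED	117		/* placeholder error value */

/* Hypothetical buffer: a block magic plus an error field. */
struct demo_buf {
	uint32_t	magic;
	int		b_error;
};

typedef void (*demo_verify_t)(struct demo_buf *bp);

/* Each btree type supplies its own verifier through the ops structure. */
struct demo_btree_ops {
	demo_verify_t	read_verify;
};

static void
demo_allocbt_read_verify(struct demo_buf *bp)
{
	if (bp->magic != DEMO_ABTB_MAGIC)
		bp->b_error = DEMO_EFSCORRUPTED;
}

static void
demo_bmbt_read_verify(struct demo_buf *bp)
{
	if (bp->magic != DEMO_BMAP_MAGIC)
		bp->b_error = DEMO_EFSCORRUPTED;
}

static const struct demo_btree_ops demo_allocbt_ops = {
	.read_verify	= demo_allocbt_read_verify,
};

static const struct demo_btree_ops demo_bmbt_ops = {
	.read_verify	= demo_bmbt_read_verify,
};

/* Generic read path: it does not know the block type, only the ops. */
static int
demo_btree_read(const struct demo_btree_ops *ops, struct demo_buf *bp)
{
	bp->b_error = 0;
	if (ops->read_verify)
		ops->read_verify(bp);
	return bp->b_error;
}
```

The generic code stays type-agnostic, and a block read through the wrong
ops table is rejected by that table's verifier.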

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_alloc_btree.c  |   61 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_bmap.c         |   60 ++++++++++++++++++++++++-----------------
 fs/xfs/xfs_bmap_btree.c   |   47 ++++++++++++++++++++++++++++++++
 fs/xfs/xfs_bmap_btree.h   |    1 +
 fs/xfs/xfs_btree.c        |   66 +++++++++++++++++++++++----------------------
 fs/xfs/xfs_btree.h        |   10 ++++---
 fs/xfs/xfs_ialloc_btree.c |   40 +++++++++++++++++++++++++++
 fs/xfs/xfs_inode.c        |    2 +-
 fs/xfs/xfs_inode.h        |    1 +
 fs/xfs/xfs_itable.c       |    3 ++-
 10 files changed, 230 insertions(+), 61 deletions(-)

diff --git a/fs/xfs/xfs_alloc_btree.c b/fs/xfs/xfs_alloc_btree.c
index f7876c6..46961e5 100644
--- a/fs/xfs/xfs_alloc_btree.c
+++ b/fs/xfs/xfs_alloc_btree.c
@@ -272,6 +272,66 @@ xfs_allocbt_key_diff(
  return (__int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock;
 }
 
+void
+xfs_allocbt_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
+ struct xfs_perag *pag = bp->b_pag;
+ unsigned int level;
+ int sblock_ok; /* block passes checks */
+
+ /*
+ * magic number and level verification
+ *
+ * During growfs operations, we can't verify the exact level as the
+ * perag is not fully initialised and hence not attached to the buffer.
+ * In this case, check against the maximum tree depth.
+ */
+ level = be16_to_cpu(block->bb_level);
+ switch (block->bb_magic) {
+ case cpu_to_be32(XFS_ABTB_MAGIC):
+ if (pag)
+ sblock_ok = level < pag->pagf_levels[XFS_BTNUM_BNOi];
+ else
+ sblock_ok = level < mp->m_ag_maxlevels;
+ break;
+ case cpu_to_be32(XFS_ABTC_MAGIC):
+ if (pag)
+ sblock_ok = level < pag->pagf_levels[XFS_BTNUM_CNTi];
+ else
+ sblock_ok = level < mp->m_ag_maxlevels;
+ break;
+ default:
+ sblock_ok = 0;
+ break;
+ }
+
+ /* numrecs verification */
+ sblock_ok = sblock_ok &&
+ be16_to_cpu(block->bb_numrecs) <= mp->m_alloc_mxr[level != 0];
+
+ /* sibling pointer verification */
+ sblock_ok = sblock_ok &&
+ (block->bb_u.s.bb_leftsib == cpu_to_be32(NULLAGBLOCK) ||
+ be32_to_cpu(block->bb_u.s.bb_leftsib) < mp->m_sb.sb_agblocks) &&
+ block->bb_u.s.bb_leftsib &&
+ (block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK) ||
+ be32_to_cpu(block->bb_u.s.bb_rightsib) < mp->m_sb.sb_agblocks) &&
+ block->bb_u.s.bb_rightsib;
+
+ if (!sblock_ok) {
+ trace_xfs_btree_corrupt(bp, _RET_IP_);
+ XFS_CORRUPTION_ERROR("xfs_allocbt_read_verify",
+ XFS_ERRLEVEL_LOW, mp, block);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 #ifdef DEBUG
 STATIC int
 xfs_allocbt_keys_inorder(
@@ -327,6 +387,7 @@ static const struct xfs_btree_ops xfs_allocbt_ops = {
  .init_rec_from_cur = xfs_allocbt_init_rec_from_cur,
  .init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur,
  .key_diff = xfs_allocbt_key_diff,
+ .read_verify = xfs_allocbt_read_verify,
 #ifdef DEBUG
  .keys_inorder = xfs_allocbt_keys_inorder,
  .recs_inorder = xfs_allocbt_recs_inorder,
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index a60f3d1..9ae7aba 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -2662,8 +2662,9 @@ xfs_bmap_btree_to_extents(
  if ((error = xfs_btree_check_lptr(cur, cbno, 1)))
  return error;
 #endif
- if ((error = xfs_btree_read_bufl(mp, tp, cbno, 0, &cbp,
- XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, cbno, 0, &cbp, XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
  return error;
  cblock = XFS_BUF_TO_BLOCK(cbp);
  if ((error = xfs_btree_check_block(cur, cblock, 0, cbp)))
@@ -4078,8 +4079,9 @@ xfs_bmap_read_extents(
  * pointer (leftmost) at each level.
  */
  while (level-- > 0) {
- if ((error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
- XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
+ XFS_BMAP_BTREE_REF, xfs_bmbt_read_verify);
+ if (error)
  return error;
  block = XFS_BUF_TO_BLOCK(bp);
  XFS_WANT_CORRUPTED_GOTO(
@@ -4124,7 +4126,8 @@ xfs_bmap_read_extents(
  */
  nextbno = be64_to_cpu(block->bb_u.l.bb_rightsib);
  if (nextbno != NULLFSBLOCK)
- xfs_btree_reada_bufl(mp, nextbno, 1);
+ xfs_btree_reada_bufl(mp, nextbno, 1,
+     xfs_bmbt_read_verify);
  /*
  * Copy records into the extent records.
  */
@@ -4156,8 +4159,9 @@ xfs_bmap_read_extents(
  */
  if (bno == NULLFSBLOCK)
  break;
- if ((error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
- XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
+ XFS_BMAP_BTREE_REF, xfs_bmbt_read_verify);
+ if (error)
  return error;
  block = XFS_BUF_TO_BLOCK(bp);
  }
@@ -5868,15 +5872,16 @@ xfs_bmap_check_leaf_extents(
  */
  while (level-- > 0) {
  /* See if buf is in cur first */
+ bp_release = 0;
  bp = xfs_bmap_get_bp(cur, XFS_FSB_TO_DADDR(mp, bno));
- if (bp) {
- bp_release = 0;
- } else {
+ if (!bp) {
  bp_release = 1;
+ error = xfs_btree_read_bufl(mp, NULL, bno, 0, &bp,
+ XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
+ goto error_norelse;
  }
- if (!bp && (error = xfs_btree_read_bufl(mp, NULL, bno, 0, &bp,
- XFS_BMAP_BTREE_REF)))
- goto error_norelse;
  block = XFS_BUF_TO_BLOCK(bp);
  XFS_WANT_CORRUPTED_GOTO(
  xfs_bmap_sanity_check(mp, bp, level),
@@ -5953,15 +5958,16 @@ xfs_bmap_check_leaf_extents(
  if (bno == NULLFSBLOCK)
  break;
 
+ bp_release = 0;
  bp = xfs_bmap_get_bp(cur, XFS_FSB_TO_DADDR(mp, bno));
- if (bp) {
- bp_release = 0;
- } else {
+ if (!bp) {
  bp_release = 1;
+ error = xfs_btree_read_bufl(mp, NULL, bno, 0, &bp,
+ XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
+ goto error_norelse;
  }
- if (!bp && (error = xfs_btree_read_bufl(mp, NULL, bno, 0, &bp,
- XFS_BMAP_BTREE_REF)))
- goto error_norelse;
  block = XFS_BUF_TO_BLOCK(bp);
  }
  if (bp_release) {
@@ -6052,7 +6058,9 @@ xfs_bmap_count_tree(
  struct xfs_btree_block *block, *nextblock;
  int numrecs;
 
- if ((error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp, XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp, XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
  return error;
  *count += 1;
  block = XFS_BUF_TO_BLOCK(bp);
@@ -6061,8 +6069,10 @@ xfs_bmap_count_tree(
  /* Not at node above leaves, count this level of nodes */
  nextbno = be64_to_cpu(block->bb_u.l.bb_rightsib);
  while (nextbno != NULLFSBLOCK) {
- if ((error = xfs_btree_read_bufl(mp, tp, nextbno,
- 0, &nbp, XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, nextbno, 0, &nbp,
+ XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
  return error;
  *count += 1;
  nextblock = XFS_BUF_TO_BLOCK(nbp);
@@ -6091,8 +6101,10 @@ xfs_bmap_count_tree(
  if (nextbno == NULLFSBLOCK)
  break;
  bno = nextbno;
- if ((error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
- XFS_BMAP_BTREE_REF)))
+ error = xfs_btree_read_bufl(mp, tp, bno, 0, &bp,
+ XFS_BMAP_BTREE_REF,
+ xfs_bmbt_read_verify);
+ if (error)
  return error;
  *count += 1;
  block = XFS_BUF_TO_BLOCK(bp);
diff --git a/fs/xfs/xfs_bmap_btree.c b/fs/xfs/xfs_bmap_btree.c
index 862084a..bddca9b 100644
--- a/fs/xfs/xfs_bmap_btree.c
+++ b/fs/xfs/xfs_bmap_btree.c
@@ -36,6 +36,7 @@
 #include "xfs_bmap.h"
 #include "xfs_error.h"
 #include "xfs_quota.h"
+#include "xfs_trace.h"
 
 /*
  * Determine the extent state.
@@ -707,6 +708,51 @@ xfs_bmbt_key_diff(
       cur->bc_rec.b.br_startoff;
 }
 
+void
+xfs_bmbt_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
+ unsigned int level;
+ int lblock_ok; /* block passes checks */
+
+ /* magic number and level verification.
+ *
+ * We don't know what fork we belong to, so just verify that the level
+ * is less than the maximum of the two. Later checks will be more
+ * precise.
+ */
+ level = be16_to_cpu(block->bb_level);
+ lblock_ok = block->bb_magic == cpu_to_be32(XFS_BMAP_MAGIC) &&
+    level < max(mp->m_bm_maxlevels[0], mp->m_bm_maxlevels[1]);
+
+ /* numrecs verification */
+ lblock_ok = lblock_ok &&
+ be16_to_cpu(block->bb_numrecs) <= mp->m_bmap_dmxr[level != 0];
+
+ /* sibling pointer verification */
+ lblock_ok = lblock_ok &&
+ block->bb_u.l.bb_leftsib &&
+ (block->bb_u.l.bb_leftsib == cpu_to_be64(NULLDFSBNO) ||
+ XFS_FSB_SANITY_CHECK(mp,
+ be64_to_cpu(block->bb_u.l.bb_leftsib))) &&
+ block->bb_u.l.bb_rightsib &&
+ (block->bb_u.l.bb_rightsib == cpu_to_be64(NULLDFSBNO) ||
+ XFS_FSB_SANITY_CHECK(mp,
+ be64_to_cpu(block->bb_u.l.bb_rightsib)));
+
+ if (!lblock_ok) {
+ trace_xfs_btree_corrupt(bp, _RET_IP_);
+ XFS_CORRUPTION_ERROR("xfs_bmbt_read_verify",
+ XFS_ERRLEVEL_LOW, mp, block);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 #ifdef DEBUG
 STATIC int
 xfs_bmbt_keys_inorder(
@@ -746,6 +792,7 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
  .init_rec_from_cur = xfs_bmbt_init_rec_from_cur,
  .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur,
  .key_diff = xfs_bmbt_key_diff,
+ .read_verify = xfs_bmbt_read_verify,
 #ifdef DEBUG
  .keys_inorder = xfs_bmbt_keys_inorder,
  .recs_inorder = xfs_bmbt_recs_inorder,
diff --git a/fs/xfs/xfs_bmap_btree.h b/fs/xfs/xfs_bmap_btree.h
index 0e66c4e..1d00fbe 100644
--- a/fs/xfs/xfs_bmap_btree.h
+++ b/fs/xfs/xfs_bmap_btree.h
@@ -232,6 +232,7 @@ extern void xfs_bmbt_to_bmdr(struct xfs_mount *, struct xfs_btree_block *, int,
 extern int xfs_bmbt_get_maxrecs(struct xfs_btree_cur *, int level);
 extern int xfs_bmdr_maxrecs(struct xfs_mount *, int blocklen, int leaf);
 extern int xfs_bmbt_maxrecs(struct xfs_mount *, int blocklen, int leaf);
+extern void xfs_bmbt_read_verify(struct xfs_buf *bp);
 
 extern struct xfs_btree_cur *xfs_bmbt_init_cursor(struct xfs_mount *,
  struct xfs_trans *, struct xfs_inode *, int);
diff --git a/fs/xfs/xfs_btree.c b/fs/xfs/xfs_btree.c
index 7e79116..ef10660 100644
--- a/fs/xfs/xfs_btree.c
+++ b/fs/xfs/xfs_btree.c
@@ -270,7 +270,8 @@ xfs_btree_dup_cursor(
  if (bp) {
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
    XFS_BUF_ADDR(bp), mp->m_bsize,
-   0, &bp, NULL);
+   0, &bp,
+   cur->bc_ops->read_verify);
  if (error) {
  xfs_btree_del_cursor(new, error);
  *ncur = NULL;
@@ -612,23 +613,24 @@ xfs_btree_offsets(
  * Get a buffer for the block, return it read in.
  * Long-form addressing.
  */
-int /* error */
+int
 xfs_btree_read_bufl(
- xfs_mount_t *mp, /* file system mount point */
- xfs_trans_t *tp, /* transaction pointer */
- xfs_fsblock_t fsbno, /* file system block number */
- uint lock, /* lock flags for read_buf */
- xfs_buf_t **bpp, /* buffer for fsbno */
- int refval) /* ref count value for buffer */
-{
- xfs_buf_t *bp; /* return value */
+ struct xfs_mount *mp, /* file system mount point */
+ struct xfs_trans *tp, /* transaction pointer */
+ xfs_fsblock_t fsbno, /* file system block number */
+ uint lock, /* lock flags for read_buf */
+ struct xfs_buf **bpp, /* buffer for fsbno */
+ int refval, /* ref count value for buffer */
+ xfs_buf_iodone_t verify)
+{
+ struct xfs_buf *bp; /* return value */
  xfs_daddr_t d; /* real disk block address */
- int error;
+ int error;
 
  ASSERT(fsbno != NULLFSBLOCK);
  d = XFS_FSB_TO_DADDR(mp, fsbno);
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, d,
-   mp->m_bsize, lock, &bp, NULL);
+   mp->m_bsize, lock, &bp, verify);
  if (error)
  return error;
  ASSERT(!xfs_buf_geterror(bp));
@@ -645,15 +647,16 @@ xfs_btree_read_bufl(
 /* ARGSUSED */
 void
 xfs_btree_reada_bufl(
- xfs_mount_t *mp, /* file system mount point */
- xfs_fsblock_t fsbno, /* file system block number */
- xfs_extlen_t count) /* count of filesystem blocks */
+ struct xfs_mount *mp, /* file system mount point */
+ xfs_fsblock_t fsbno, /* file system block number */
+ xfs_extlen_t count, /* count of filesystem blocks */
+ xfs_buf_iodone_t verify)
 {
  xfs_daddr_t d;
 
  ASSERT(fsbno != NULLFSBLOCK);
  d = XFS_FSB_TO_DADDR(mp, fsbno);
- xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, NULL);
+ xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, verify);
 }
 
 /*
@@ -663,17 +666,18 @@ xfs_btree_reada_bufl(
 /* ARGSUSED */
 void
 xfs_btree_reada_bufs(
- xfs_mount_t *mp, /* file system mount point */
- xfs_agnumber_t agno, /* allocation group number */
- xfs_agblock_t agbno, /* allocation group block number */
- xfs_extlen_t count) /* count of filesystem blocks */
+ struct xfs_mount *mp, /* file system mount point */
+ xfs_agnumber_t agno, /* allocation group number */
+ xfs_agblock_t agbno, /* allocation group block number */
+ xfs_extlen_t count, /* count of filesystem blocks */
+ xfs_buf_iodone_t verify)
 {
  xfs_daddr_t d;
 
  ASSERT(agno != NULLAGNUMBER);
  ASSERT(agbno != NULLAGBLOCK);
  d = XFS_AGB_TO_DADDR(mp, agno, agbno);
- xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, NULL);
+ xfs_buf_readahead(mp->m_ddev_targp, d, mp->m_bsize * count, verify);
 }
 
 STATIC int
@@ -687,12 +691,14 @@ xfs_btree_readahead_lblock(
  xfs_dfsbno_t right = be64_to_cpu(block->bb_u.l.bb_rightsib);
 
  if ((lr & XFS_BTCUR_LEFTRA) && left != NULLDFSBNO) {
- xfs_btree_reada_bufl(cur->bc_mp, left, 1);
+ xfs_btree_reada_bufl(cur->bc_mp, left, 1,
+     cur->bc_ops->read_verify);
  rval++;
  }
 
  if ((lr & XFS_BTCUR_RIGHTRA) && right != NULLDFSBNO) {
- xfs_btree_reada_bufl(cur->bc_mp, right, 1);
+ xfs_btree_reada_bufl(cur->bc_mp, right, 1,
+     cur->bc_ops->read_verify);
  rval++;
  }
 
@@ -712,13 +718,13 @@ xfs_btree_readahead_sblock(
 
  if ((lr & XFS_BTCUR_LEFTRA) && left != NULLAGBLOCK) {
  xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno,
-     left, 1);
+     left, 1, cur->bc_ops->read_verify);
  rval++;
  }
 
  if ((lr & XFS_BTCUR_RIGHTRA) && right != NULLAGBLOCK) {
  xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno,
-     right, 1);
+     right, 1, cur->bc_ops->read_verify);
  rval++;
  }
 
@@ -1016,19 +1022,15 @@ xfs_btree_read_buf_block(
 
  d = xfs_btree_ptr_to_daddr(cur, ptr);
  error = xfs_trans_read_buf(mp, cur->bc_tp, mp->m_ddev_targp, d,
-   mp->m_bsize, flags, bpp, NULL);
+   mp->m_bsize, flags, bpp,
+   cur->bc_ops->read_verify);
  if (error)
  return error;
 
  ASSERT(!xfs_buf_geterror(*bpp));
-
  xfs_btree_set_refs(cur, *bpp);
  *block = XFS_BUF_TO_BLOCK(*bpp);
-
- error = xfs_btree_check_block(cur, *block, level, *bpp);
- if (error)
- xfs_trans_brelse(cur->bc_tp, *bpp);
- return error;
+ return 0;
 }
 
 /*
diff --git a/fs/xfs/xfs_btree.h b/fs/xfs/xfs_btree.h
index c9cf2d0..3a4c314 100644
--- a/fs/xfs/xfs_btree.h
+++ b/fs/xfs/xfs_btree.h
@@ -188,6 +188,7 @@ struct xfs_btree_ops {
  __int64_t (*key_diff)(struct xfs_btree_cur *cur,
       union xfs_btree_key *key);
 
+ void (*read_verify)(struct xfs_buf *bp);
 #ifdef DEBUG
  /* check that k1 is lower than k2 */
  int (*keys_inorder)(struct xfs_btree_cur *cur,
@@ -355,7 +356,8 @@ xfs_btree_read_bufl(
  xfs_fsblock_t fsbno, /* file system block number */
  uint lock, /* lock flags for read_buf */
  struct xfs_buf **bpp, /* buffer for fsbno */
- int refval);/* ref count value for buffer */
+ int refval, /* ref count value for buffer */
+ xfs_buf_iodone_t verify);
 
 /*
  * Read-ahead the block, don't wait for it, don't return a buffer.
@@ -365,7 +367,8 @@ void /* error */
 xfs_btree_reada_bufl(
  struct xfs_mount *mp, /* file system mount point */
  xfs_fsblock_t fsbno, /* file system block number */
- xfs_extlen_t count); /* count of filesystem blocks */
+ xfs_extlen_t count, /* count of filesystem blocks */
+ xfs_buf_iodone_t verify);
 
 /*
  * Read-ahead the block, don't wait for it, don't return a buffer.
@@ -376,7 +379,8 @@ xfs_btree_reada_bufs(
  struct xfs_mount *mp, /* file system mount point */
  xfs_agnumber_t agno, /* allocation group number */
  xfs_agblock_t agbno, /* allocation group block number */
- xfs_extlen_t count); /* count of filesystem blocks */
+ xfs_extlen_t count, /* count of filesystem blocks */
+ xfs_buf_iodone_t verify);
 
 /*
  * Initialise a new btree block header
diff --git a/fs/xfs/xfs_ialloc_btree.c b/fs/xfs/xfs_ialloc_btree.c
index 2b8b7a3..11306c6 100644
--- a/fs/xfs/xfs_ialloc_btree.c
+++ b/fs/xfs/xfs_ialloc_btree.c
@@ -33,6 +33,7 @@
 #include "xfs_ialloc.h"
 #include "xfs_alloc.h"
 #include "xfs_error.h"
+#include "xfs_trace.h"
 
 
 STATIC int
@@ -181,6 +182,44 @@ xfs_inobt_key_diff(
   cur->bc_rec.i.ir_startino;
 }
 
+void
+xfs_inobt_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
+ unsigned int level;
+ int sblock_ok; /* block passes checks */
+
+ /* magic number and level verification */
+ level = be16_to_cpu(block->bb_level);
+ sblock_ok = block->bb_magic == cpu_to_be32(XFS_IBT_MAGIC) &&
+    level < mp->m_in_maxlevels;
+
+ /* numrecs verification */
+ sblock_ok = sblock_ok &&
+ be16_to_cpu(block->bb_numrecs) <= mp->m_inobt_mxr[level != 0];
+
+ /* sibling pointer verification */
+ sblock_ok = sblock_ok &&
+ (block->bb_u.s.bb_leftsib == cpu_to_be32(NULLAGBLOCK) ||
+ be32_to_cpu(block->bb_u.s.bb_leftsib) < mp->m_sb.sb_agblocks) &&
+ block->bb_u.s.bb_leftsib &&
+ (block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK) ||
+ be32_to_cpu(block->bb_u.s.bb_rightsib) < mp->m_sb.sb_agblocks) &&
+ block->bb_u.s.bb_rightsib;
+
+ if (!sblock_ok) {
+ trace_xfs_btree_corrupt(bp, _RET_IP_);
+ XFS_CORRUPTION_ERROR("xfs_inobt_read_verify",
+ XFS_ERRLEVEL_LOW, mp, block);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ }
+
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
 #ifdef DEBUG
 STATIC int
 xfs_inobt_keys_inorder(
@@ -218,6 +257,7 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
  .init_rec_from_cur = xfs_inobt_init_rec_from_cur,
  .init_ptr_from_cur = xfs_inobt_init_ptr_from_cur,
  .key_diff = xfs_inobt_key_diff,
+ .read_verify = xfs_inobt_read_verify,
 #ifdef DEBUG
  .keys_inorder = xfs_inobt_keys_inorder,
  .recs_inorder = xfs_inobt_recs_inorder,
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 514eac9..3a243d0 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -382,7 +382,7 @@ xfs_inobp_check(
 }
 #endif
 
-static void
+void
 xfs_inode_buf_verify(
  struct xfs_buf *bp)
 {
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 21b4de3..1a89211 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -554,6 +554,7 @@ int xfs_imap_to_bp(struct xfs_mount *, struct xfs_trans *,
        struct xfs_buf **, uint, uint);
 int xfs_iread(struct xfs_mount *, struct xfs_trans *,
   struct xfs_inode *, uint);
+void xfs_inode_buf_verify(struct xfs_buf *);
 void xfs_dinode_to_disk(struct xfs_dinode *,
    struct xfs_icdinode *);
 void xfs_idestroy_fork(struct xfs_inode *, int);
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 3998fd2..0f18d41 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -396,7 +396,8 @@ xfs_bulkstat(
  if (xfs_inobt_maskn(chunkidx, nicluster)
  & ~r.ir_free)
  xfs_btree_reada_bufs(mp, agno,
- agbno, nbcluster);
+ agbno, nbcluster,
+ xfs_inode_buf_verify);
  }
  irbp->ir_startino = r.ir_startino;
  irbp->ir_freecount = r.ir_freecount;
--
1.7.10
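
For readers who want to experiment with the short-form checks outside the kernel tree, the logic of xfs_inobt_read_verify() in the patch above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the sketch_* names, the struct layouts and the host-endian fields are all hypothetical stand-ins, not the kernel's on-disk format or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "no sibling" marker (all ones). */
#define SKETCH_NULLAGBLOCK	0xffffffffu

/* Hypothetical, host-endian stand-in for a short-form btree block header. */
struct sketch_sblock {
	uint32_t	magic;
	uint16_t	level;
	uint16_t	numrecs;
	uint32_t	leftsib;
	uint32_t	rightsib;
};

/* Hypothetical stand-in for the per-mount limits the verifier consults. */
struct sketch_geom {
	uint32_t	magic;		/* expected block magic */
	unsigned int	maxlevels;	/* tree height limit */
	unsigned int	mxr[2];		/* max records: [0] leaf, [1] node */
	uint32_t	agblocks;	/* blocks per allocation group */
};

/* A sibling must be non-zero and either the null marker or inside the AG. */
static bool
sketch_sib_ok(uint32_t sib, uint32_t agblocks)
{
	if (sib == 0)
		return false;
	return sib == SKETCH_NULLAGBLOCK || sib < agblocks;
}

/* Return true if the block header passes all of the checks. */
bool
sketch_sblock_verify(const struct sketch_sblock *b,
		     const struct sketch_geom *g)
{
	assert(b != NULL && g != NULL);

	/* magic number and level verification */
	if (b->magic != g->magic || b->level >= g->maxlevels)
		return false;

	/* numrecs verification: leaves and nodes have different limits */
	if (b->numrecs > g->mxr[b->level != 0])
		return false;

	/* sibling pointer verification */
	return sketch_sib_ok(b->leftsib, g->agblocks) &&
	       sketch_sib_ok(b->rightsib, g->agblocks);
}
```

The early returns are only a readability choice for the sketch; the patch accumulates the result in a single flag so that one corruption report covers every failed check.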

[PATCH 17/32] xfs: verify dquot blocks as they are read from disk

Dave Chinner
From: Dave Chinner <[hidden email]>

Add a dquot buffer verify callback function and pass it into the
buffer read functions. This checks all the dquots in a buffer, but
cannot completely verify that the dquot ids are correct. The verifier
also cannot repair errors, so an additional function is added to
repair bad dquots in the buffer if such an error is detected in a
context where repair is allowed.

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_dquot.c |  117 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 95 insertions(+), 22 deletions(-)
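
As a reading aid, the id check described above (the verifier only knows that ids should increase monotonically from the first dquot in the buffer) can be sketched in user space roughly like this. Everything named sketch_* or SKETCH_* is a hypothetical stand-in with host-endian fields; it is not the on-disk dquot format and not the behaviour of xfs_qm_dqcheck().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value used only by this sketch. */
#define SKETCH_DQ_MAGIC	0x4451u

/* Hypothetical, simplified stand-in for one on-disk dquot record. */
struct sketch_dquot {
	uint16_t	magic;
	uint32_t	id;
};

/*
 * Check every record in the buffer. The expected id of record i is the
 * id of record 0 plus i, because the real first id is not known here.
 * Returns the index of the first record that fails, or -1 if all pass.
 */
int
sketch_dquot_buf_verify(const struct sketch_dquot *d, size_t n)
{
	uint32_t	first_id;
	size_t		i;

	assert(d != NULL || n == 0);
	if (n == 0)
		return -1;

	first_id = d[0].id;
	for (i = 0; i < n; i++) {
		if (d[i].magic != SKETCH_DQ_MAGIC)
			return (int)i;
		if (d[i].id != first_id + (uint32_t)i)
			return (int)i;
	}
	return -1;
}
```

Note that a corrupt id in record 0 is reported against record 1 by this scheme, which is the limitation the comment in xfs_dquot_read_verify() below points out.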

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index e95f800..2e18382 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -360,6 +360,89 @@ xfs_qm_dqalloc(
  return (error);
 }
 
+STATIC void
+xfs_dquot_read_verify(
+ struct xfs_buf *bp)
+{
+ struct xfs_mount *mp = bp->b_target->bt_mount;
+ struct xfs_dqblk *d = (struct xfs_dqblk *)bp->b_addr;
+ struct xfs_disk_dquot *ddq;
+ xfs_dqid_t id = 0;
+ int i;
+
+ /*
+ * On the first read of the buffer, verify that each dquot is valid.
+ * We don't know what the id of the dquot is supposed to be, just that
+ * they should be increasing monotonically within the buffer. If the
+ * first id is corrupt, then it will fail on the second dquot in the
+ * buffer so corruptions could point to the wrong dquot in this case.
+ */
+ for (i = 0; i < mp->m_quotainfo->qi_dqperchunk; i++) {
+ int error;
+
+ ddq = &d[i].dd_diskdq;
+
+ if (i == 0)
+ id = be32_to_cpu(ddq->d_id);
+
+ error = xfs_qm_dqcheck(mp, ddq, id + i, 0, XFS_QMOPT_DOWARN,
+ "xfs_dquot_read_verify");
+ if (error) {
+ XFS_CORRUPTION_ERROR("xfs_dquot_read_verify",
+     XFS_ERRLEVEL_LOW, mp, d);
+ xfs_buf_ioerror(bp, EFSCORRUPTED);
+ break;
+ }
+ }
+ bp->b_iodone = NULL;
+ xfs_buf_ioend(bp, 0);
+}
+
+STATIC int
+xfs_qm_dqrepair(
+ struct xfs_mount *mp,
+ struct xfs_trans *tp,
+ struct xfs_dquot *dqp,
+ xfs_dqid_t firstid,
+ struct xfs_buf **bpp)
+{
+ int error;
+ struct xfs_disk_dquot *ddq;
+ struct xfs_dqblk *d;
+ int i;
+
+ /*
+ * Read the buffer without verification so we get the corrupted
+ * buffer returned to us.
+ */
+ error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, dqp->q_blkno,
+   mp->m_quotainfo->qi_dqchunklen,
+   0, bpp, NULL);
+
+ if (error) {
+ ASSERT(*bpp == NULL);
+ return XFS_ERROR(error);
+ }
+
+ ASSERT(xfs_buf_islocked(*bpp));
+ d = (struct xfs_dqblk *)(*bpp)->b_addr;
+
+ /* Do the actual repair of dquots in this buffer */
+ for (i = 0; i < mp->m_quotainfo->qi_dqperchunk; i++) {
+ ddq = &d[i].dd_diskdq;
+ error = xfs_qm_dqcheck(mp, ddq, firstid + i,
+       dqp->dq_flags & XFS_DQ_ALLTYPES,
+       XFS_QMOPT_DQREPAIR, "xfs_qm_dqrepair");
+ if (error) {
+ /* repair failed, we're screwed */
+ xfs_trans_brelse(tp, *bpp);
+ return XFS_ERROR(EIO);
+ }
+ }
+
+ return 0;
+}
+
 /*
  * Maps a dquot to the buffer containing its on-disk version.
  * This returns a ptr to the buffer containing the on-disk dquot
@@ -378,7 +461,6 @@ xfs_qm_dqtobp(
  xfs_buf_t *bp;
  xfs_inode_t *quotip = XFS_DQ_TO_QIP(dqp);
  xfs_mount_t *mp = dqp->q_mount;
- xfs_disk_dquot_t *ddq;
  xfs_dqid_t id = be32_to_cpu(dqp->q_core.d_id);
  xfs_trans_t *tp = (tpp ? *tpp : NULL);
 
@@ -439,33 +521,24 @@ xfs_qm_dqtobp(
  error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
    dqp->q_blkno,
    mp->m_quotainfo->qi_dqchunklen,
-   0, &bp, NULL);
- if (error || !bp)
- return XFS_ERROR(error);
- }
+   0, &bp, xfs_dquot_read_verify);
 
- ASSERT(xfs_buf_islocked(bp));
-
- /*
- * calculate the location of the dquot inside the buffer.
- */
- ddq = bp->b_addr + dqp->q_bufoffset;
+ if (error == EFSCORRUPTED && (flags & XFS_QMOPT_DQREPAIR)) {
+ xfs_dqid_t firstid = (xfs_dqid_t)map.br_startoff *
+ mp->m_quotainfo->qi_dqperchunk;
+ ASSERT(bp == NULL);
+ error = xfs_qm_dqrepair(mp, tp, dqp, firstid, &bp);
+ }
 
- /*
- * A simple sanity check in case we got a corrupted dquot...
- */
- error = xfs_qm_dqcheck(mp, ddq, id, dqp->dq_flags & XFS_DQ_ALLTYPES,
-   flags & (XFS_QMOPT_DQREPAIR|XFS_QMOPT_DOWARN),
-   "dqtobp");
- if (error) {
- if (!(flags & XFS_QMOPT_DQREPAIR)) {
- xfs_trans_brelse(tp, bp);
- return XFS_ERROR(EIO);
+ if (error) {
+ ASSERT(bp == NULL);
+ return XFS_ERROR(error);
  }
  }
 
+ ASSERT(xfs_buf_islocked(bp));
  *O_bpp = bp;
- *O_ddpp = ddq;
+ *O_ddpp = bp->b_addr + dqp->q_bufoffset;
 
  return (0);
 }
--
1.7.10
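
The control flow that xfs_qm_dqtobp() ends up with in the patch above (read with a verifier, and only if that reports corruption and repair is allowed, read again without a verifier and fix the buffer) can be sketched in user space roughly as follows. This is a minimal illustration under stated assumptions: the sketch_* functions, the one-byte "magic" check and the SKETCH_EFSCORRUPTED value are hypothetical and do not model the real buffer cache or transaction code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUF_LEN		16
#define SKETCH_GOOD_BYTE	0xABu
#define SKETCH_EFSCORRUPTED	117	/* stand-in error code for this sketch */

/* Hypothetical buffer: a copy of the "disk" contents plus an error field. */
struct sketch_buf {
	uint8_t	data[SKETCH_BUF_LEN];
	int	error;
};

/* Verifier callbacks record failure in the buffer, as in the patch. */
typedef void (*sketch_verify_t)(struct sketch_buf *bp);

/* Read the block and run the optional verifier; return the buffer error. */
int
sketch_read_buf(const uint8_t *disk, struct sketch_buf *bp,
		sketch_verify_t verify)
{
	assert(disk != NULL && bp != NULL);
	memcpy(bp->data, disk, SKETCH_BUF_LEN);
	bp->error = 0;
	if (verify)
		verify(bp);
	return bp->error;
}

/* Toy verifier: the first byte must hold the expected value. */
void
sketch_verify_first_byte(struct sketch_buf *bp)
{
	if (bp->data[0] != SKETCH_GOOD_BYTE)
		bp->error = SKETCH_EFSCORRUPTED;
}

/* Verified read, falling back to an unverified read plus repair. */
int
sketch_read_or_repair(const uint8_t *disk, struct sketch_buf *bp,
		      int allow_repair)
{
	int	error;

	error = sketch_read_buf(disk, bp, sketch_verify_first_byte);
	if (error == SKETCH_EFSCORRUPTED && allow_repair) {
		/* Re-read without verification to get the corrupt buffer. */
		error = sketch_read_buf(disk, bp, NULL);
		if (!error)
			bp->data[0] = SKETCH_GOOD_BYTE;	/* in-memory repair */
	}
	return error;
}
```

Keeping repair out of the verifier means the callback stays a pure check that can also be passed to readahead, while the decision to repair remains with the caller that knows whether it is allowed.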

[PATCH 18/32] xfs: add verifier callback to directory read code

Dave Chinner
From: Dave Chinner <[hidden email]>

Signed-off-by: Dave Chinner <[hidden email]>
Reviewed-by: Christoph Hellwig <[hidden email]>
Reviewed-by: Phil White <[hidden email]>
---
 fs/xfs/xfs_attr.c       |   23 ++++++++++++-----------
 fs/xfs/xfs_attr_leaf.c  |   18 +++++++++---------
 fs/xfs/xfs_da_btree.c   |   44 ++++++++++++++++++++++++++++----------------
 fs/xfs/xfs_da_btree.h   |    7 ++++---
 fs/xfs/xfs_dir2_block.c |   23 ++++++++++++-----------
 fs/xfs/xfs_dir2_leaf.c  |   33 ++++++++++++++++-----------------
 fs/xfs/xfs_dir2_node.c  |   43 ++++++++++++++++++++-----------------------
 fs/xfs/xfs_file.c       |    2 +-
 8 files changed, 102 insertions(+), 91 deletions(-)

diff --git a/fs/xfs/xfs_attr.c b/fs/xfs/xfs_attr.c
index 474c57a..cd5a9cd 100644
--- a/fs/xfs/xfs_attr.c
+++ b/fs/xfs/xfs_attr.c
@@ -904,7 +904,7 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
  dp = args->dp;
  args->blkno = 0;
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1032,7 +1032,7 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
  * remove the "old" attr from that block (neat, huh!)
  */
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1,
-     &bp, XFS_ATTR_FORK);
+     &bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1101,7 +1101,7 @@ xfs_attr_leaf_removename(xfs_da_args_t *args)
  dp = args->dp;
  args->blkno = 0;
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error) {
  return(error);
  }
@@ -1159,7 +1159,7 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
 
  args->blkno = 0;
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1190,7 +1190,8 @@ xfs_attr_leaf_list(xfs_attr_list_context_t *context)
  trace_xfs_attr_leaf_list(context);
 
  context->cursor->blkno = 0;
- error = xfs_da_read_buf(NULL, context->dp, 0, -1, &bp, XFS_ATTR_FORK);
+ error = xfs_da_read_buf(NULL, context->dp, 0, -1, &bp, XFS_ATTR_FORK,
+ NULL);
  if (error)
  return XFS_ERROR(error);
  ASSERT(bp != NULL);
@@ -1605,7 +1606,7 @@ xfs_attr_node_removename(xfs_da_args_t *args)
  state->path.blk[0].bp = NULL;
 
  error = xfs_da_read_buf(args->trans, args->dp, 0, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error)
  goto out;
  ASSERT((((xfs_attr_leafblock_t *)bp->b_addr)->hdr.info.magic) ==
@@ -1718,7 +1719,7 @@ xfs_attr_refillstate(xfs_da_state_t *state)
  error = xfs_da_read_buf(state->args->trans,
  state->args->dp,
  blk->blkno, blk->disk_blkno,
- &blk->bp, XFS_ATTR_FORK);
+ &blk->bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  } else {
@@ -1737,7 +1738,7 @@ xfs_attr_refillstate(xfs_da_state_t *state)
  error = xfs_da_read_buf(state->args->trans,
  state->args->dp,
  blk->blkno, blk->disk_blkno,
- &blk->bp, XFS_ATTR_FORK);
+ &blk->bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  } else {
@@ -1827,7 +1828,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
  bp = NULL;
  if (cursor->blkno > 0) {
  error = xfs_da_read_buf(NULL, context->dp, cursor->blkno, -1,
-      &bp, XFS_ATTR_FORK);
+      &bp, XFS_ATTR_FORK, NULL);
  if ((error != 0) && (error != EFSCORRUPTED))
  return(error);
  if (bp) {
@@ -1870,7 +1871,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
  for (;;) {
  error = xfs_da_read_buf(NULL, context->dp,
       cursor->blkno, -1, &bp,
-      XFS_ATTR_FORK);
+      XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  if (unlikely(bp == NULL)) {
@@ -1937,7 +1938,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
  cursor->blkno = be32_to_cpu(leaf->hdr.info.forw);
  xfs_trans_brelse(NULL, bp);
  error = xfs_da_read_buf(NULL, context->dp, cursor->blkno, -1,
-      &bp, XFS_ATTR_FORK);
+      &bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  if (unlikely((bp == NULL))) {
diff --git a/fs/xfs/xfs_attr_leaf.c b/fs/xfs/xfs_attr_leaf.c
index 4bfc732..ba2b9a2 100644
--- a/fs/xfs/xfs_attr_leaf.c
+++ b/fs/xfs/xfs_attr_leaf.c
@@ -871,7 +871,7 @@ xfs_attr_leaf_to_node(xfs_da_args_t *args)
  if (error)
  goto out;
  error = xfs_da_read_buf(args->trans, args->dp, 0, -1, &bp1,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error)
  goto out;
  ASSERT(bp1 != NULL);
@@ -1642,7 +1642,7 @@ xfs_attr_leaf_toosmall(xfs_da_state_t *state, int *action)
  if (blkno == 0)
  continue;
  error = xfs_da_read_buf(state->args->trans, state->args->dp,
- blkno, -1, &bp, XFS_ATTR_FORK);
+ blkno, -1, &bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -2519,7 +2519,7 @@ xfs_attr_leaf_clearflag(xfs_da_args_t *args)
  * Set up the operation.
  */
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error) {
  return(error);
  }
@@ -2584,7 +2584,7 @@ xfs_attr_leaf_setflag(xfs_da_args_t *args)
  * Set up the operation.
  */
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error) {
  return(error);
  }
@@ -2641,7 +2641,7 @@ xfs_attr_leaf_flipflags(xfs_da_args_t *args)
  * Read the block containing the "old" attr
  */
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno, -1, &bp1,
-     XFS_ATTR_FORK);
+     XFS_ATTR_FORK, NULL);
  if (error) {
  return(error);
  }
@@ -2652,7 +2652,7 @@ xfs_attr_leaf_flipflags(xfs_da_args_t *args)
  */
  if (args->blkno2 != args->blkno) {
  error = xfs_da_read_buf(args->trans, args->dp, args->blkno2,
- -1, &bp2, XFS_ATTR_FORK);
+ -1, &bp2, XFS_ATTR_FORK, NULL);
  if (error) {
  return(error);
  }
@@ -2753,7 +2753,7 @@ xfs_attr_root_inactive(xfs_trans_t **trans, xfs_inode_t *dp)
  * the extents in reverse order the extent containing
  * block 0 must still be there.
  */
- error = xfs_da_read_buf(*trans, dp, 0, -1, &bp, XFS_ATTR_FORK);
+ error = xfs_da_read_buf(*trans, dp, 0, -1, &bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  blkno = XFS_BUF_ADDR(bp);
@@ -2839,7 +2839,7 @@ xfs_attr_node_inactive(
  * before we come back to this one.
  */
  error = xfs_da_read_buf(*trans, dp, child_fsb, -2, &child_bp,
- XFS_ATTR_FORK);
+ XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  if (child_bp) {
@@ -2880,7 +2880,7 @@ xfs_attr_node_inactive(
  */
  if ((i+1) < count) {
  error = xfs_da_read_buf(*trans, dp, 0, parent_blkno,
- &bp, XFS_ATTR_FORK);
+ &bp, XFS_ATTR_FORK, NULL);
  if (error)
  return(error);
  child_fsb = be32_to_cpu(node->btree[i+1].before);
diff --git a/fs/xfs/xfs_da_btree.c b/fs/xfs/xfs_da_btree.c
index 4af8bad..f9e9149 100644
--- a/fs/xfs/xfs_da_btree.c
+++ b/fs/xfs/xfs_da_btree.c
@@ -747,7 +747,7 @@ xfs_da_root_join(xfs_da_state_t *state, xfs_da_state_blk_t *root_blk)
  child = be32_to_cpu(oldroot->btree[0].before);
  ASSERT(child != 0);
  error = xfs_da_read_buf(args->trans, args->dp, child, -1, &bp,
-     args->whichfork);
+     args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -838,7 +838,8 @@ xfs_da_node_toosmall(xfs_da_state_t *state, int *action)
  if (blkno == 0)
  continue;
  error = xfs_da_read_buf(state->args->trans, state->args->dp,
- blkno, -1, &bp, state->args->whichfork);
+ blkno, -1, &bp, state->args->whichfork,
+ NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1084,7 +1085,7 @@ xfs_da_node_lookup_int(xfs_da_state_t *state, int *result)
  */
  blk->blkno = blkno;
  error = xfs_da_read_buf(args->trans, args->dp, blkno,
- -1, &blk->bp, args->whichfork);
+ -1, &blk->bp, args->whichfork, NULL);
  if (error) {
  blk->blkno = 0;
  state->path.active--;
@@ -1247,7 +1248,7 @@ xfs_da_blk_link(xfs_da_state_t *state, xfs_da_state_blk_t *old_blk,
  if (old_info->back) {
  error = xfs_da_read_buf(args->trans, args->dp,
  be32_to_cpu(old_info->back),
- -1, &bp, args->whichfork);
+ -1, &bp, args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1268,7 +1269,7 @@ xfs_da_blk_link(xfs_da_state_t *state, xfs_da_state_blk_t *old_blk,
  if (old_info->forw) {
  error = xfs_da_read_buf(args->trans, args->dp,
  be32_to_cpu(old_info->forw),
- -1, &bp, args->whichfork);
+ -1, &bp, args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1368,7 +1369,7 @@ xfs_da_blk_unlink(xfs_da_state_t *state, xfs_da_state_blk_t *drop_blk,
  if (drop_info->back) {
  error = xfs_da_read_buf(args->trans, args->dp,
  be32_to_cpu(drop_info->back),
- -1, &bp, args->whichfork);
+ -1, &bp, args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1385,7 +1386,7 @@ xfs_da_blk_unlink(xfs_da_state_t *state, xfs_da_state_blk_t *drop_blk,
  if (drop_info->forw) {
  error = xfs_da_read_buf(args->trans, args->dp,
  be32_to_cpu(drop_info->forw),
- -1, &bp, args->whichfork);
+ -1, &bp, args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(bp != NULL);
@@ -1470,7 +1471,7 @@ xfs_da_path_shift(xfs_da_state_t *state, xfs_da_state_path_t *path,
  */
  blk->blkno = blkno;
  error = xfs_da_read_buf(args->trans, args->dp, blkno, -1,
-     &blk->bp, args->whichfork);
+ &blk->bp, args->whichfork, NULL);
  if (error)
  return(error);
  ASSERT(blk->bp != NULL);
@@ -1733,7 +1734,8 @@ xfs_da_swap_lastblock(
  * Read the last block in the btree space.
  */
  last_blkno = (xfs_dablk_t)lastoff - mp->m_dirblkfsbs;
- if ((error = xfs_da_read_buf(tp, ip, last_blkno, -1, &last_buf, w)))
+ error = xfs_da_read_buf(tp, ip, last_blkno, -1, &last_buf, w, NULL);
+ if (error)
  return error;
  /*
  * Copy the last block into the dead buffer and log it.
@@ -1759,7 +1761,9 @@ xfs_da_swap_lastblock(
  * If the moved block has a left sibling, fix up the pointers.
  */
  if ((sib_blkno = be32_to_cpu(dead_info->back))) {
- if ((error = xfs_da_read_buf(tp, ip, sib_blkno, -1, &sib_buf, w)))
+ error = xfs_da_read_buf(tp, ip, sib_blkno, -1, &sib_buf, w,
+ NULL);
+ if (error)
  goto done;
  sib_info = sib_buf->b_addr;
  if (unlikely(
@@ -1780,7 +1784,9 @@ xfs_da_swap_lastblock(
  * If the moved block has a right sibling, fix up the pointers.
  */
  if ((sib_blkno = be32_to_cpu(dead_info->forw))) {
- if ((error = xfs_da_read_buf(tp, ip, sib_blkno, -1, &sib_buf, w)))
+ error = xfs_da_read_buf(tp, ip, sib_blkno, -1, &sib_buf, w,
+ NULL);
+ if (error)
  goto done;
  sib_info = sib_buf->b_addr;
  if (unlikely(
@@ -1803,7 +1809,9 @@ xfs_da_swap_lastblock(
  * Walk down the tree looking for the parent of the moved block.
  */
  for (;;) {
- if ((error = xfs_da_read_buf(tp, ip, par_blkno, -1, &par_buf, w)))
+ error = xfs_da_read_buf(tp, ip, par_blkno, -1, &par_buf, w,
+ NULL);
+ if (error)
  goto done;
  par_node = par_buf->b_addr;
  if (unlikely(par_node->hdr.info.magic !=
@@ -1853,7 +1861,9 @@ xfs_da_swap_lastblock(
  error = XFS_ERROR(EFSCORRUPTED);
  goto done;
  }
- if ((error = xfs_da_read_buf(tp, ip, par_blkno, -1, &par_buf, w)))
+ error = xfs_da_read_buf(tp, ip, par_blkno, -1, &par_buf, w,
+ NULL);
+ if (error)
  goto done;
  par_node = par_buf->b_addr;
  if (unlikely(
@@ -2139,7 +2149,8 @@ xfs_da_read_buf(
  xfs_dablk_t bno,
  xfs_daddr_t mappedbno,
  struct xfs_buf **bpp,
- int whichfork)
+ int whichfork,
+ xfs_buf_iodone_t verifier)
 {
  struct xfs_buf *bp;
  struct xfs_buf_map map;
@@ -2161,7 +2172,7 @@ xfs_da_read_buf(
 
  error = xfs_trans_read_buf_map(dp->i_mount, trans,
  dp->i_mount->m_ddev_targp,
- mapp, nmap, 0, &bp, NULL);
+ mapp, nmap, 0, &bp, verifier);
  if (error)
  goto out_free;
 
@@ -2217,7 +2228,8 @@ xfs_da_reada_buf(
  struct xfs_trans *trans,
  struct xfs_inode *dp,
  xfs_dablk_t bno,
- int whichfork)
+ int whichfork,
+ xfs_buf_iodone_t verifier)
 {
  xfs_daddr_t mappedbno = -1;
  struct xfs_buf_map map;
diff --git a/fs/xfs/xfs_da_btree.h b/fs/xfs/xfs_da_btree.h
index 132adaf..bf8bfaa 100644
--- a/fs/xfs/xfs_da_btree.h
+++ b/fs/xfs/xfs_da_btree.h
@@ -18,7 +18,6 @@
 #ifndef __XFS_DA_BTREE_H__
 #define __XFS_DA_BTREE_H__
 
-struct xfs_buf;
 struct xfs_bmap_free;
 struct xfs_inode;
 struct xfs_mount;
@@ -226,9 +225,11 @@ int xfs_da_get_buf(struct xfs_trans *trans, struct xfs_inode *dp,
       struct xfs_buf **bp, int whichfork);
 int xfs_da_read_buf(struct xfs_trans *trans, struct xfs_inode *dp,
        xfs_dablk_t bno, xfs_daddr_t mappedbno,
-       struct xfs_buf **bpp, int whichfork);
+       struct xfs_buf **bpp, int whichfork,
+       xfs_buf_iodone_t verifier);
 xfs_daddr_t xfs_da_reada_buf(struct xfs_trans *trans, struct xfs_inode *dp,
- xfs_dablk_t bno, int whichfork);
+ xfs_dablk_t bno, int whichfork,
+ xfs_buf_iodone_t verifier);
 int xfs_da_shrink_inode(xfs_da_args_t *args, xfs_dablk_t dead_blkno,
   struct xfs_buf *dead_buf);
 
diff --git a/fs/xfs/xfs_dir2_block.c b/fs/xfs/xfs_dir2_block.c
index e93ca8f..53666ca 100644
--- a/fs/xfs/xfs_dir2_block.c
+++ b/fs/xfs/xfs_dir2_block.c
@@ -97,10 +97,10 @@ xfs_dir2_block_addname(
  /*
  * Read the (one and only) directory block into dabuf bp.
  */
- if ((error =
-    xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp, XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp,
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  ASSERT(bp != NULL);
  hdr = bp->b_addr;
  /*
@@ -457,7 +457,7 @@ xfs_dir2_block_getdents(
  * Can't read the block, give up, else get dabuf in bp.
  */
  error = xfs_da_read_buf(NULL, dp, mp->m_dirdatablk, -1,
- &bp, XFS_DATA_FORK);
+ &bp, XFS_DATA_FORK, NULL);
  if (error)
  return error;
 
@@ -640,10 +640,10 @@ xfs_dir2_block_lookup_int(
  /*
  * Read the buffer, return error if we can't get it.
  */
- if ((error =
-    xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp, XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp,
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  ASSERT(bp != NULL);
  hdr = bp->b_addr;
  xfs_dir2_data_check(dp, bp);
@@ -917,10 +917,11 @@ xfs_dir2_leaf_to_block(
  /*
  * Read the data block if we don't already have it, give up if it fails.
  */
- if (dbp == NULL &&
-    (error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &dbp,
-    XFS_DATA_FORK))) {
- return error;
+ if (!dbp) {
+ error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &dbp,
+ XFS_DATA_FORK, NULL);
+ if (error)
+ return error;
  }
  hdr = dbp->b_addr;
  ASSERT(hdr->magic == cpu_to_be32(XFS_DIR2_DATA_MAGIC));
diff --git a/fs/xfs/xfs_dir2_leaf.c b/fs/xfs/xfs_dir2_leaf.c
index bac8698..86e3dc1 100644
--- a/fs/xfs/xfs_dir2_leaf.c
+++ b/fs/xfs/xfs_dir2_leaf.c
@@ -315,10 +315,9 @@ xfs_dir2_leaf_addname(
  * Read the leaf block.
  */
  error = xfs_da_read_buf(tp, dp, mp->m_dirleafblk, -1, &lbp,
- XFS_DATA_FORK);
- if (error) {
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  ASSERT(lbp != NULL);
  /*
  * Look up the entry by hash value and name.
@@ -500,9 +499,9 @@ xfs_dir2_leaf_addname(
  * Just read that one in.
  */
  else {
- if ((error =
-    xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, use_block),
-    -1, &dbp, XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, use_block),
+ -1, &dbp, XFS_DATA_FORK, NULL);
+ if (error) {
  xfs_trans_brelse(tp, lbp);
  return error;
  }
@@ -895,7 +894,7 @@ xfs_dir2_leaf_readbuf(
  error = xfs_da_read_buf(NULL, dp, map->br_startoff,
  map->br_blockcount >= mp->m_dirblkfsbs ?
     XFS_FSB_TO_DADDR(mp, map->br_startblock) : -1,
- &bp, XFS_DATA_FORK);
+ &bp, XFS_DATA_FORK, NULL);
 
  /*
  * Should just skip over the data block instead of giving up.
@@ -938,7 +937,7 @@ xfs_dir2_leaf_readbuf(
  xfs_da_reada_buf(NULL, dp,
  map[mip->ra_index].br_startoff +
  mip->ra_offset,
- XFS_DATA_FORK);
+ XFS_DATA_FORK, NULL);
  mip->ra_current = i;
  }
 
@@ -1376,7 +1375,7 @@ xfs_dir2_leaf_lookup_int(
  * Read the leaf block into the buffer.
  */
  error = xfs_da_read_buf(tp, dp, mp->m_dirleafblk, -1, &lbp,
- XFS_DATA_FORK);
+ XFS_DATA_FORK, NULL);
  if (error)
  return error;
  *lbpp = lbp;
@@ -1411,7 +1410,7 @@ xfs_dir2_leaf_lookup_int(
  xfs_trans_brelse(tp, dbp);
  error = xfs_da_read_buf(tp, dp,
  xfs_dir2_db_to_da(mp, newdb),
- -1, &dbp, XFS_DATA_FORK);
+ -1, &dbp, XFS_DATA_FORK, NULL);
  if (error) {
  xfs_trans_brelse(tp, lbp);
  return error;
@@ -1453,7 +1452,7 @@ xfs_dir2_leaf_lookup_int(
  xfs_trans_brelse(tp, dbp);
  error = xfs_da_read_buf(tp, dp,
  xfs_dir2_db_to_da(mp, cidb),
- -1, &dbp, XFS_DATA_FORK);
+ -1, &dbp, XFS_DATA_FORK, NULL);
  if (error) {
  xfs_trans_brelse(tp, lbp);
  return error;
@@ -1738,10 +1737,10 @@ xfs_dir2_leaf_trim_data(
  /*
  * Read the offending data block.  We need its buffer.
  */
- if ((error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, db), -1, &dbp,
- XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, db), -1, &dbp,
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
 
  leaf = lbp->b_addr;
  ltp = xfs_dir2_leaf_tail_p(mp, leaf);
@@ -1864,10 +1863,10 @@ xfs_dir2_node_to_leaf(
  /*
  * Read the freespace block.
  */
- if ((error = xfs_da_read_buf(tp, dp, mp->m_dirfreeblk, -1, &fbp,
- XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp,  mp->m_dirfreeblk, -1, &fbp,
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  free = fbp->b_addr;
  ASSERT(free->hdr.magic == cpu_to_be32(XFS_DIR2_FREE_MAGIC));
  ASSERT(!free->hdr.firstdb);
diff --git a/fs/xfs/xfs_dir2_node.c b/fs/xfs/xfs_dir2_node.c
index 6c70524..290c2b1 100644
--- a/fs/xfs/xfs_dir2_node.c
+++ b/fs/xfs/xfs_dir2_node.c
@@ -399,7 +399,7 @@ xfs_dir2_leafn_lookup_for_addname(
  */
  error = xfs_da_read_buf(tp, dp,
  xfs_dir2_db_to_da(mp, newfdb),
- -1, &curbp, XFS_DATA_FORK);
+ -1, &curbp, XFS_DATA_FORK, NULL);
  if (error)
  return error;
  free = curbp->b_addr;
@@ -536,7 +536,7 @@ xfs_dir2_leafn_lookup_for_entry(
  } else {
  error = xfs_da_read_buf(tp, dp,
  xfs_dir2_db_to_da(mp, newdb),
- -1, &curbp, XFS_DATA_FORK);
+ -1, &curbp, XFS_DATA_FORK, NULL);
  if (error)
  return error;
  }
@@ -915,10 +915,10 @@ xfs_dir2_leafn_remove(
  * read in the free block.
  */
  fdb = xfs_dir2_db_to_fdb(mp, db);
- if ((error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, fdb),
- -1, &fbp, XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, fdb),
+ -1, &fbp, XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  free = fbp->b_addr;
  ASSERT(free->hdr.magic == cpu_to_be32(XFS_DIR2_FREE_MAGIC));
  ASSERT(be32_to_cpu(free->hdr.firstdb) ==
@@ -1169,11 +1169,10 @@ xfs_dir2_leafn_toosmall(
  /*
  * Read the sibling leaf block.
  */
- if ((error =
-    xfs_da_read_buf(state->args->trans, state->args->dp, blkno,
-    -1, &bp, XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(state->args->trans, state->args->dp,
+ blkno, -1, &bp, XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
  ASSERT(bp != NULL);
  /*
  * Count bytes in the two blocks combined.
@@ -1454,14 +1453,13 @@ xfs_dir2_node_addname_int(
  * This should be really rare, so there's no reason
  * to avoid it.
  */
- if ((error = xfs_da_read_buf(tp, dp,
- xfs_dir2_db_to_da(mp, fbno), -2, &fbp,
- XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp,
+ xfs_dir2_db_to_da(mp, fbno), -2,
+ &fbp, XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
- if (unlikely(fbp == NULL)) {
+ if (!fbp)
  continue;
- }
  free = fbp->b_addr;
  ASSERT(free->hdr.magic == cpu_to_be32(XFS_DIR2_FREE_MAGIC));
  findex = 0;
@@ -1520,9 +1518,9 @@ xfs_dir2_node_addname_int(
  * that was just allocated.
  */
  fbno = xfs_dir2_db_to_fdb(mp, dbno);
- if (unlikely(error = xfs_da_read_buf(tp, dp,
- xfs_dir2_db_to_da(mp, fbno), -2, &fbp,
- XFS_DATA_FORK)))
+ error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, fbno), -2,
+ &fbp, XFS_DATA_FORK, NULL);
+ if (error)
  return error;
 
  /*
@@ -1631,7 +1629,7 @@ xfs_dir2_node_addname_int(
  * Read the data block in.
  */
  error = xfs_da_read_buf(tp, dp, xfs_dir2_db_to_da(mp, dbno),
- -1, &dbp, XFS_DATA_FORK);
+ -1, &dbp, XFS_DATA_FORK, NULL);
  if (error)
  return error;
  hdr = dbp->b_addr;
@@ -1917,11 +1915,10 @@ xfs_dir2_node_trim_free(
  /*
  * Read the freespace block.
  */
- if (unlikely(error = xfs_da_read_buf(tp, dp, (xfs_dablk_t)fo, -2, &bp,
- XFS_DATA_FORK))) {
+ error = xfs_da_read_buf(tp, dp, (xfs_dablk_t)fo, -2, &bp,
+ XFS_DATA_FORK, NULL);
+ if (error)
  return error;
- }
-
  /*
  * There can be holes in freespace.  If fo is a hole, there's
  * nothing to do.
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index c42f99e..f6dab7d 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -891,7 +891,7 @@ xfs_dir_open(
  */
  mode = xfs_ilock_map_shared(ip);
  if (ip->i_d.di_nextents > 0)
- xfs_da_reada_buf(NULL, ip, 0, XFS_DATA_FORK);
+ xfs_da_reada_buf(NULL, ip, 0, XFS_DATA_FORK, NULL);
  xfs_iunlock(ip, mode);
  return 0;
 }
--
1.7.10


[PATCH 19/32] xfs: factor dir2 block read operations

Dave Chinner
From: Dave Chinner <[hidden email]>

In preparation for verifying dir2 block format buffers, factor
the read operations out of the block operations (lookup, addname,
getdents) and some of the additional logic to make it easier to
understand and modify the code.

Signed-off-by: Dave Chinner <[hidden email]>
---
 fs/xfs/xfs_dir2_block.c |  386 +++++++++++++++++++++++++----------------------
 1 file changed, 209 insertions(+), 177 deletions(-)

diff --git a/fs/xfs/xfs_dir2_block.c b/fs/xfs/xfs_dir2_block.c
index 53666ca..25ce409 100644
--- a/fs/xfs/xfs_dir2_block.c
+++ b/fs/xfs/xfs_dir2_block.c
@@ -56,6 +56,178 @@ xfs_dir_startup(void)
  xfs_dir_hash_dotdot = xfs_da_hashname((unsigned char *)"..", 2);
 }
 
+static int
+xfs_dir2_block_read(
+ struct xfs_trans *tp,
+ struct xfs_inode *dp,
+ struct xfs_buf **bpp)
+{
+ struct xfs_mount *mp = dp->i_mount;
+
+ return xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, bpp,
+ XFS_DATA_FORK, NULL);
+}
+
+static void
+xfs_dir2_block_need_space(
+ struct xfs_dir2_data_hdr *hdr,
+ struct xfs_dir2_block_tail *btp,
+ struct xfs_dir2_leaf_entry *blp,
+ __be16 **tagpp,
+ struct xfs_dir2_data_unused **dupp,
+ struct xfs_dir2_data_unused **enddupp,
+ int *compact,
+ int len)
+{
+ struct xfs_dir2_data_free *bf;
+ __be16 *tagp = NULL;
+ struct xfs_dir2_data_unused *dup = NULL;
+ struct xfs_dir2_data_unused *enddup = NULL;
+
+ *compact = 0;
+ bf = hdr->bestfree;
+
+ /*
+ * If there are stale entries we'll use one for the leaf.
+ */
+ if (btp->stale) {
+ if (be16_to_cpu(bf[0].length) >= len) {
+ /*
+ * The biggest entry is big enough to avoid compaction.
+ */
+ dup = (xfs_dir2_data_unused_t *)
+      ((char *)hdr + be16_to_cpu(bf[0].offset));
+ goto out;
+ }
+
+ /*
+ * Will need to compact to make this work.
+ * Tag just before the first leaf entry.
+ */
+ *compact = 1;
+ tagp = (__be16 *)blp - 1;
+
+ /* Data object just before the first leaf entry.  */
+ dup = (xfs_dir2_data_unused_t *)((char *)hdr + be16_to_cpu(*tagp));
+
+ /*
+ * If it's not free then the data will go where the
+ * leaf data starts now, if it works at all.
+ */
+ if (be16_to_cpu(dup->freetag) == XFS_DIR2_DATA_FREE_TAG) {
+ if (be16_to_cpu(dup->length) + (be32_to_cpu(btp->stale) - 1) *
+    (uint)sizeof(*blp) < len)
+ dup = NULL;
+ } else if ((be32_to_cpu(btp->stale) - 1) * (uint)sizeof(*blp) < len)
+ dup = NULL;
+ else
+ dup = (xfs_dir2_data_unused_t *)blp;
+ goto out;
+ }
+
+ /*
+ * no stale entries, so just use free space.
+ * Tag just before the first leaf entry.
+ */
+ tagp = (__be16 *)blp - 1;
+
+ /* Data object just before the first leaf entry.  */
+ enddup = (xfs_dir2_data_unused_t *)((char *)hdr + be16_to_cpu(*tagp));
+
+ /*
+ * If it's not free then can't do this add without cleaning up:
+ * the space before the first leaf entry needs to be free so it
+ * can be expanded to hold the pointer to the new entry.
+ */
+ if (be16_to_cpu(enddup->freetag) == XFS_DIR2_DATA_FREE_TAG) {
+ /*
+ * Check out the biggest freespace and see if it's the same one.
+ */
+ dup = (xfs_dir2_data_unused_t *)
+      ((char *)hdr + be16_to_cpu(bf[0].offset));
+ if (dup != enddup) {
+ /*
+ * Not the same free entry, just check its length.
+ */
+ if (be16_to_cpu(dup->length) < len)
+ dup = NULL;
+ goto out;
+ }
+
+ /*
+ * It is the biggest freespace, can it hold the leaf too?
+ */
+ if (be16_to_cpu(dup->length) < len + (uint)sizeof(*blp)) {
+ /*
+ * No, use the second-largest entry instead if it works.
+ */
+ if (be16_to_cpu(bf[1].length) >= len)
+ dup = (xfs_dir2_data_unused_t *)
+      ((char *)hdr + be16_to_cpu(bf[1].offset));
+ else
+ dup = NULL;
+ }
+ }
+out:
+ *tagpp = tagp;
+ *dupp = dup;
+ *enddupp = enddup;
+}
+
+/*
+ * compact the leaf entries.
+ * Leave the highest-numbered stale entry stale.
+ * XXX should be the one closest to mid but mid is not yet computed.
+ */
+static void
+xfs_dir2_block_compact(
+ struct xfs_trans *tp,
+ struct xfs_buf *bp,
+ struct xfs_dir2_data_hdr *hdr,
+ struct xfs_dir2_block_tail *btp,
+ struct xfs_dir2_leaf_entry *blp,
+ int *needlog,
+ int *lfloghigh,
+ int *lfloglow)
+{
+ int fromidx; /* source leaf index */
+ int toidx; /* target leaf index */
+ int needscan = 0;
+ int highstale; /* high stale index */
+
+ fromidx = toidx = be32_to_cpu(btp->count) - 1;
+ highstale = *lfloghigh = -1;
+ for (; fromidx >= 0; fromidx--) {
+ if (blp[fromidx].address == cpu_to_be32(XFS_DIR2_NULL_DATAPTR)) {
+ if (highstale == -1)
+ highstale = toidx;
+ else {
+ if (*lfloghigh == -1)
+ *lfloghigh = toidx;
+ continue;
+ }
+ }
+ if (fromidx < toidx)
+ blp[toidx] = blp[fromidx];
+ toidx--;
+ }
+ *lfloglow = toidx + 1 - (be32_to_cpu(btp->stale) - 1);
+ *lfloghigh -= be32_to_cpu(btp->stale) - 1;
+ be32_add_cpu(&btp->count, -(be32_to_cpu(btp->stale) - 1));
+ xfs_dir2_data_make_free(tp, bp,
+ (xfs_dir2_data_aoff_t)((char *)blp - (char *)hdr),
+ (xfs_dir2_data_aoff_t)((be32_to_cpu(btp->stale) - 1) * sizeof(*blp)),
+ needlog, &needscan);
+ blp += be32_to_cpu(btp->stale) - 1;
+ btp->stale = cpu_to_be32(1);
+ /*
+ * If we now need to rebuild the bestfree map, do so.
+ * This needs to happen before the next call to use_free.
+ */
+ if (needscan)
+ xfs_dir2_data_freescan(tp->t_mountp, hdr, needlog);
+}
+
 /*
  * Add an entry to a block directory.
  */
@@ -63,7 +235,6 @@ int /* error */
 xfs_dir2_block_addname(
  xfs_da_args_t *args) /* directory op arguments */
 {
- xfs_dir2_data_free_t *bf; /* bestfree table in block */
  xfs_dir2_data_hdr_t *hdr; /* block header */
  xfs_dir2_leaf_entry_t *blp; /* block leaf entries */
  struct xfs_buf *bp; /* buffer for block */
@@ -94,134 +265,44 @@ xfs_dir2_block_addname(
  dp = args->dp;
  tp = args->trans;
  mp = dp->i_mount;
- /*
- * Read the (one and only) directory block into dabuf bp.
- */
- error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp,
- XFS_DATA_FORK, NULL);
+
+ /* Read the (one and only) directory block into bp. */
+ error = xfs_dir2_block_read(tp, dp, &bp);
  if (error)
  return error;
- ASSERT(bp != NULL);
- hdr = bp->b_addr;
- /*
- * Check the magic number, corrupted if wrong.
- */
- if (unlikely(hdr->magic != cpu_to_be32(XFS_DIR2_BLOCK_MAGIC))) {
- XFS_CORRUPTION_ERROR("xfs_dir2_block_addname",
-     XFS_ERRLEVEL_LOW, mp, hdr);
- xfs_trans_brelse(tp, bp);
- return XFS_ERROR(EFSCORRUPTED);
- }
+
  len = xfs_dir2_data_entsize(args->namelen);
+
  /*
  * Set up pointers to parts of the block.
  */
- bf = hdr->bestfree;
+ hdr = bp->b_addr;
  btp = xfs_dir2_block_tail_p(mp, hdr);
  blp = xfs_dir2_block_leaf_p(btp);
+
  /*
- * No stale entries?  Need space for entry and new leaf.
- */
- if (!btp->stale) {
- /*
- * Tag just before the first leaf entry.
- */
- tagp = (__be16 *)blp - 1;
- /*
- * Data object just before the first leaf entry.
- */
- enddup = (xfs_dir2_data_unused_t *)((char *)hdr + be16_to_cpu(*tagp));
- /*
- * If it's not free then can't do this add without cleaning up:
- * the space before the first leaf entry needs to be free so it
- * can be expanded to hold the pointer to the new entry.
- */
- if (be16_to_cpu(enddup->freetag) != XFS_DIR2_DATA_FREE_TAG)
- dup = enddup = NULL;
- /*
- * Check out the biggest freespace and see if it's the same one.
- */
- else {
- dup = (xfs_dir2_data_unused_t *)
-      ((char *)hdr + be16_to_cpu(bf[0].offset));
- if (dup == enddup) {
- /*
- * It is the biggest freespace, is it too small
- * to hold the new leaf too?
- */
- if (be16_to_cpu(dup->length) < len + (uint)sizeof(*blp)) {
- /*
- * Yes, we use the second-largest
- * entry instead if it works.
- */
- if (be16_to_cpu(bf[1].length) >= len)
- dup = (xfs_dir2_data_unused_t *)
-      ((char *)hdr +
-       be16_to_cpu(bf[1].offset));
- else
- dup = NULL;
- }
- } else {
- /*
- * Not the same free entry,
- * just check its length.
- */
- if (be16_to_cpu(dup->length) < len) {
- dup = NULL;
- }
- }
- }
- compact = 0;
- }
- /*
- * If there are stale entries we'll use one for the leaf.
- * Is the biggest entry enough to avoid compaction?
- */
- else if (be16_to_cpu(bf[0].length) >= len) {
- dup = (xfs_dir2_data_unused_t *)
-      ((char *)hdr + be16_to_cpu(bf[0].offset));
- compact = 0;
- }
- /*
- * Will need to compact to make this work.
+ * Find out if we can reuse stale entries or whether we need extra
+ * space for entry and new leaf.
  */
- else {
- /*
- * Tag just before the first leaf entry.
- */
- tagp = (__be16 *)blp - 1;
- /*
- * Data object just before the first leaf entry.
- */
- dup = (xfs_dir2_data_unused_t *)((char *)hdr + be16_to_cpu(*tagp));
- /*
- * If it's not free then the data will go where the
- * leaf data starts now, if it works at all.
- */
- if (be16_to_cpu(dup->freetag) == XFS_DIR2_DATA_FREE_TAG) {
- if (be16_to_cpu(dup->length) + (be32_to_cpu(btp->stale) - 1) *
-    (uint)sizeof(*blp) < len)
- dup = NULL;
- } else if ((be32_to_cpu(btp->stale) - 1) * (uint)sizeof(*blp) < len)
- dup = NULL;
- else
- dup = (xfs_dir2_data_unused_t *)blp;
- compact = 1;
- }
+ xfs_dir2_block_need_space(hdr, btp, blp, &tagp, &dup,
+  &enddup, &compact, len);
+
  /*
- * If this isn't a real add, we're done with the buffer.
+ * Done everything we need for a space check now.
  */
- if (args->op_flags & XFS_DA_OP_JUSTCHECK)
+ if (args->op_flags & XFS_DA_OP_JUSTCHECK) {
  xfs_trans_brelse(tp, bp);
+ if (!dup)
+ return XFS_ERROR(ENOSPC);
+ return 0;
+ }
+
  /*
  * If we don't have space for the new entry & leaf ...
  */
  if (!dup) {
- /*
- * Not trying to actually do anything, or don't have
- * a space reservation: return no-space.
- */
- if ((args->op_flags & XFS_DA_OP_JUSTCHECK) || args->total == 0)
+ /* Don't have a space reservation: return no-space.  */
+ if (args->total == 0)
  return XFS_ERROR(ENOSPC);
  /*
  * Convert to the next larger format.
@@ -232,65 +313,24 @@ xfs_dir2_block_addname(
  return error;
  return xfs_dir2_leaf_addname(args);
  }
- /*
- * Just checking, and it would work, so say so.
- */
- if (args->op_flags & XFS_DA_OP_JUSTCHECK)
- return 0;
+
  needlog = needscan = 0;
+
  /*
  * If need to compact the leaf entries, do it now.
- * Leave the highest-numbered stale entry stale.
- * XXX should be the one closest to mid but mid is not yet computed.
- */
- if (compact) {
- int fromidx; /* source leaf index */
- int toidx; /* target leaf index */
-
- for (fromidx = toidx = be32_to_cpu(btp->count) - 1,
- highstale = lfloghigh = -1;
-     fromidx >= 0;
-     fromidx--) {
- if (blp[fromidx].address ==
-    cpu_to_be32(XFS_DIR2_NULL_DATAPTR)) {
- if (highstale == -1)
- highstale = toidx;
- else {
- if (lfloghigh == -1)
- lfloghigh = toidx;
- continue;
- }
- }
- if (fromidx < toidx)
- blp[toidx] = blp[fromidx];
- toidx--;
- }
- lfloglow = toidx + 1 - (be32_to_cpu(btp->stale) - 1);
- lfloghigh -= be32_to_cpu(btp->stale) - 1;
- be32_add_cpu(&btp->count, -(be32_to_cpu(btp->stale) - 1));
- xfs_dir2_data_make_free(tp, bp,
- (xfs_dir2_data_aoff_t)((char *)blp - (char *)hdr),
- (xfs_dir2_data_aoff_t)((be32_to_cpu(btp->stale) - 1) * sizeof(*blp)),
- &needlog, &needscan);
- blp += be32_to_cpu(btp->stale) - 1;
- btp->stale = cpu_to_be32(1);
- /*
- * If we now need to rebuild the bestfree map, do so.
- * This needs to happen before the next call to use_free.
- */
- if (needscan) {
- xfs_dir2_data_freescan(mp, hdr, &needlog);
- needscan = 0;
- }
- }
- /*
- * Set leaf logging boundaries to impossible state.
- * For the no-stale case they're set explicitly.
  */
+ if (compact)
+ xfs_dir2_block_compact(tp, bp, hdr, btp, blp, &needlog,
+      &lfloghigh, &lfloglow);
  else if (btp->stale) {
+ /*
+ * Set leaf logging boundaries to impossible state.
+ * For the no-stale case they're set explicitly.
+ */
  lfloglow = be32_to_cpu(btp->count);
  lfloghigh = -1;
  }
+
  /*
  * Find the slot that's first lower than our hash value, -1 if none.
  */
@@ -450,18 +490,13 @@ xfs_dir2_block_getdents(
  /*
  * If the block number in the offset is out of range, we're done.
  */
- if (xfs_dir2_dataptr_to_db(mp, *offset) > mp->m_dirdatablk) {
+ if (xfs_dir2_dataptr_to_db(mp, *offset) > mp->m_dirdatablk)
  return 0;
- }
- /*
- * Can't read the block, give up, else get dabuf in bp.
- */
- error = xfs_da_read_buf(NULL, dp, mp->m_dirdatablk, -1,
- &bp, XFS_DATA_FORK, NULL);
+
+ error = xfs_dir2_block_read(NULL, dp, &bp);
  if (error)
  return error;
 
- ASSERT(bp != NULL);
  /*
  * Extract the byte offset we start at from the seek pointer.
  * We'll skip entries before this.
@@ -637,14 +672,11 @@ xfs_dir2_block_lookup_int(
  dp = args->dp;
  tp = args->trans;
  mp = dp->i_mount;
- /*
- * Read the buffer, return error if we can't get it.
- */
- error = xfs_da_read_buf(tp, dp, mp->m_dirdatablk, -1, &bp,
- XFS_DATA_FORK, NULL);
+
+ error = xfs_dir2_block_read(tp, dp, &bp);
  if (error)
  return error;
- ASSERT(bp != NULL);
+
  hdr = bp->b_addr;
  xfs_dir2_data_check(dp, bp);
  btp = xfs_dir2_block_tail_p(mp, hdr);
--
1.7.10
